ISO/IEC TR 20547-2:2018 Information technology — Big data reference architecture — Part 2: Use cases and derived requirements
Introduction
ISO/IEC TR 20547-2 focuses on forming a community of interest from industry, academia, and government, with the goal of developing a consensus list of big data technical considerations across all stakeholders. This included gathering and understanding various examples of use cases from diversified areas (i.e., application domains). To achieve this goal, the following tasks were done:
— gathered input from all stakeholders regarding big data technical considerations;
— analyzed and prioritized a list of challenging use-case-specific technical considerations that may delay or prevent adoption of big data deployment;
— developed a comprehensive list of generalized big data technical considerations for ISO/IEC 20547-3, Information technology — Big data reference architecture — Part 3: Reference architecture; and
— documented the findings in ISO/IEC TR 20547-2.
1 Scope
ISO/IEC TR 20547-2 provides examples of big data use cases with application domains and technical considerations derived from the contributed use cases.
2 Normative references
The following documents, in whole or in part, are normatively referenced in ISO/IEC TR 20547-2 and are indispensable for its application. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.
ISO/IEC 20546, Information technology — Big data — Definition and vocabulary
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO/IEC 20546 and the following apply.
ISO and IEC maintain terminological databases for use in standardization at the following addresses:
— Author/company/email: Name, company, and email (if provided) of the person(s) submitting the use case
— Actors/stakeholders and their roles and responsibilities: Description of the players and their roles in the use case
— Goals: Objectives of the use case
— Use case description: Brief description of the use case
4.2 Current solution
Current solutions describe the current approach to processing big data at the hardware and software infrastructure and analytics levels.
— Compute (System): Computing component of the data analysis system
— Storage: Storage component of the data analysis system
— Networking: Networking component of the data analysis system
— Software: Software component of the data analysis system
4.3 Big data characteristics
Big data characteristics describe the properties of the (raw) data, including the four major ‘V’s of big data.
— Data source: The origin of data, which could be from instruments, Internet of Things, web, surveys, commercial activity, or from simulations. The source(s) can be distributed, centralized, local, or remote.
— Data destination: If the data is transformed in the use case, where the final results end up.
— Volume: The characteristic of datasets that is most associated with big data. Volume represents the extensive amount of data available for analysis to extract valuable information. The assumption that you can extract the most value by analysing as much of the volume of data as possible was one of the primary drivers for the creation of the new scaling technologies.
— Velocity: The rate of flow at which the data is created, stored, analysed, or visualized. Big data velocity means a large quantity of data needs to be processed in a short amount of time. Techniques for dealing with high-velocity data are commonly referred to as streaming data techniques.
— Variety: The need to analyse data from a number of domains and a number of data types. The variety of data was handled through transformations or pre-analytics to extract features that would allow integration with other data. The wider range of data formats, logical models, timescales, and semantics that analytics would ideally draw on complicates the integration of the variety of data. Metadata is increasingly used to aid in the integration.
— Variability: Changes in data rate, format/structure, semantics, and/or quality that impact the supported application, analytic, or problem. Impacts can include the need to refactor architectures, interfaces, processing/algorithms, integration/fusion, storage, applicability, or use of the data.
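As an illustration only, the characteristics listed above could be captured in a small record type when use cases are collected in machine-readable form. The sketch below is not part of the report: the class name, field names, and example values are all hypothetical, and the free-text fields simply mirror the list items.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BigDataCharacteristics:
    """Hypothetical record for the characteristics fields; names are illustrative."""
    data_source: str       # origin of the data and whether it is distributed or centralized
    data_destination: str  # where final results end up, if the data is transformed
    volume: str            # free-text size estimate
    velocity: str          # rate at which data is created, stored, or analysed
    variety: List[str] = field(default_factory=list)      # data types and domains
    variability: List[str] = field(default_factory=list)  # changes in rate, format, semantics

    def missing_fields(self) -> List[str]:
        """Return the names of fields the submitter left empty."""
        return [name for name, value in vars(self).items() if not value]


# Invented example values, used only to exercise the record type.
example = BigDataCharacteristics(
    data_source="field surveys, distributed",
    data_destination="",
    volume="about 1 PB",
    velocity="streamed continuously during collection",
    variety=["survey responses", "administrative records"],
)
print(example.missing_fields())  # ['data_destination', 'variability']
```

A check like `missing_fields` is one way to spot incomplete submissions before use cases are compared across application domains.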
4.4 Big data science
Big data science describes the high-level aspects of the data analysis process.
— Veracity and data quality: This covers the completeness and accuracy of the data with respect to semantic content as well as syntactical quality of data (such as presence of missing fields or incorrect values).
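A minimal sketch of how completeness and syntactic quality might be measured, assuming records arrive as dictionaries. The function name, the sample records, and the validation rule are assumptions made for illustration; the report does not prescribe a method.

```python
from typing import Callable, Dict, Iterable, List, Mapping


def quality_report(
    records: Iterable[Mapping[str, object]],
    required: List[str],
    validators: Dict[str, Callable[[object], bool]],
) -> Dict[str, float]:
    """Return the share of records that are complete and the share that are valid."""
    total = complete = valid = 0
    for rec in records:
        total += 1
        # Completeness: every required field is present and non-empty.
        has_all = all(rec.get(name) not in (None, "") for name in required)
        if has_all:
            complete += 1
        # Validity: the record is complete and every checked field passes its rule.
        if has_all and all(
            name in rec and check(rec[name]) for name, check in validators.items()
        ):
            valid += 1
    if total == 0:
        return {"completeness": 0.0, "validity": 0.0}
    return {"completeness": complete / total, "validity": valid / total}


# Invented sample: one clean record, one incorrect value, two with missing fields.
records = [
    {"id": 1, "age": 34},
    {"id": 2, "age": -5},
    {"id": 3, "age": None},
    {"id": 4},
]
report = quality_report(
    records,
    required=["id", "age"],
    validators={"age": lambda v: isinstance(v, int) and 0 <= v <= 120},
)
print(report)  # {'completeness': 0.5, 'validity': 0.25}
```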
— Index the data.
— Categorize records (e.g., sensitive, non-sensitive, privacy data).
— Transform old file formats to modern formats.
— Conduct e-discovery.
— Search and retrieve to respond to special requests.
— Search and retrieve public records by public users.
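The indexing, categorization, and public search steps listed above can be sketched as follows. This is a toy example under stated assumptions: the categorization rules (a pattern resembling a personal ID number and two keywords) are hypothetical stand-ins for real records policy, and an in-memory inverted index stands in for the commercial search products a production system would use.

```python
import re
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set

# Hypothetical rules; real criteria would come from the applicable records policy.
PRIVACY_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # resembles a personal ID number
SENSITIVE_WORDS = {"confidential", "restricted"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def categorize(text: str) -> str:
    """Assign one of three illustrative categories to a record."""
    if PRIVACY_PATTERN.search(text):
        return "privacy"
    if SENSITIVE_WORDS & set(tokenize(text)):
        return "sensitive"
    return "non-sensitive"


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Build an inverted index mapping each token to the ids of records containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(
    index: Dict[str, Set[str]],
    query: str,
    categories: Dict[str, str],
    allowed: FrozenSet[str] = frozenset({"non-sensitive"}),
) -> List[str]:
    """Return sorted ids of records that contain every query term and fall in an allowed category."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(doc_id for doc_id in hits if categories[doc_id] in allowed)


# Invented records, used only to exercise the functions.
docs = {
    "r1": "Budget memo for the regional office",
    "r2": "Confidential budget review",
    "r3": "Applicant ID 123-45-6789 budget claim",
}
categories = {doc_id: categorize(text) for doc_id, text in docs.items()}
index = build_index(docs)

print(search(index, "budget", categories))  # ['r1']
print(search(index, "budget", categories, frozenset({"non-sensitive", "sensitive"})))  # ['r1', 'r2']
```

Filtering by category at query time keeps a single index for all records while limiting public users to non-sensitive results; separate indexes per category would be an equally reasonable design.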
Hundreds of TBs are stored centrally in commercial databases supported by custom software and commercial search products.
Future:
Federal agencies possess many distributed data sources, which currently must be transferred to centralized storage. In the future, those data sources may reside in multiple cloud environments. In this case, physical custody should avoid transferring big data from cloud to cloud or from cloud to data center.
5.2.3 Use case 3: Statistical survey response improvement
Application:
Survey costs are increasing as survey responses decline. The goal of this work is to increase the quality, and reduce the cost, of field surveys by using advanced ‘recommendation system techniques.’ These techniques are open and scientifically objective, using data mashed up from several sources and also historical survey para-data (i.e., administrative data about the survey).
Current approach:
This use case handles about a PB of data coming from surveys and other government administrative sources. Data can be streamed. During the decennial census, approximately 150 million records transmitted as field data are streamed continuously. All data must be both confidential and secure. All processes must be auditable for security and confidentiality as required by various legal statutes. Data quality should be high and statistically checked for accuracy and reliability throughout the collection process. Solution information is described in
Future:
Improved recommendation systems are needed, similar to those used in e-commerce (e.g., similar to the use case in 5.3.3), that reduce costs and improve quality while providing confidentiality safeguards that are reliable and publicly auditable. Data visualization is useful for data review, operational activity, and general analysis. The system continues to evolve and incorporate important features such as mobile access.
5.2.4 Use case 4: Non-traditional data in statistical survey response improvement (adaptive design)
Application:
Survey costs are increasing as survey responses decline. This use case has goals similar to those of the Statistical Survey Response Improvement use case (see Clause 5.2.3). However, this case involves non-traditional commercial and public data sources from the web, wireless communication, and electronic transactions mashed up analytically with traditional surveys. The purpose of the mashup is to improve statistics for small area geographies and new measures, as well as the timeliness of released statistics.
Current approach:
Data from a range of sources are integrated, including survey data, other government administrative data, web-scraped data, wireless data, e-transaction data, possibly social media data, and positioning data.
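A rough sketch of the integration step, assuming every source can be keyed by a common small-area code. The function name, source names, area codes, and values are invented for illustration; an actual mashup would also need record linkage, weighting, and confidentiality controls that are not shown here.

```python
from typing import Dict, List


def mash_up(
    survey: Dict[str, float],
    auxiliary: Dict[str, Dict[str, float]],
) -> List[dict]:
    """Combine sources keyed by area code into one row per area."""
    areas = sorted(set(survey) | {a for source in auxiliary.values() for a in source})
    rows = []
    for area in areas:
        row = {"area": area, "survey": survey.get(area)}
        for name, values in auxiliary.items():
            row[name] = values.get(area)  # None when the source has no data for the area
        # Flag areas covered only by non-traditional sources.
        row["survey_missing"] = area not in survey
        rows.append(row)
    return rows


# Invented inputs: a survey estimate per area plus two auxiliary sources.
survey = {"A01": 0.62, "A02": 0.55}
auxiliary = {
    "web": {"A01": 120, "A03": 40},
    "wireless": {"A02": 300},
}
rows = mash_up(survey, auxiliary)
for row in rows:
    print(row)
```

Areas flagged with `survey_missing` are the ones where non-traditional sources might supply information that the traditional survey does not cover.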