#1
19th June 2015, 03:23 PM
CSVTU Notes
For some reason I was not able to attend the Data Mining and Warehousing classes at Chhattisgarh Swami Vivekanand Technical University, Bhilai. Exams are coming soon, so I need some short notes on Data Mining and Warehousing. Could someone please provide them here?
#2
19th June 2015, 03:54 PM
Re: CSVTU Notes
You are looking for short notes on the Data Mining and Warehousing subject of Chhattisgarh Swami Vivekanand Technical University, Bhilai. Here they are:

CSVTU Data Mining and Warehousing notes

Data Mining and Warehousing
Unit-1 Overview and Concepts

Need for data warehousing:
The need for data warehousing is as follows.

Data Integration:
Even if you are a small credit union, your enterprise data most likely flows through and lives in a variety of in-house and external systems. You want to ask questions that represent slices of key information (referred to as Key Performance Indicators, or KPIs), such as: what is the member profitability, or the member value attrition? You also want to be able to analyze it across all products by location, time and channel. You realize that all the required data is probably there, but it is not integrated and organized in a way that lets you get the answers easily.
Perhaps your IT staff has been providing the reports you need every time through a series of manual and automated steps: stripping or extracting the data from one source, sorting and merging it with data from other sources, manually scrubbing and enriching the data, and then running reports against it. You wonder whether there ought to be a better and more reliable way of doing this.

A data warehouse serves not only as a repository for historical data but also as an excellent data integration platform. The data in the data warehouse is integrated, subject-oriented, time-variant and non-volatile, which enables you to get a 360° view of your organization.

Advanced Reporting & Analysis:
The data warehouse is designed specifically to support querying, reporting and analysis tasks. The data model is flattened (denormalized) and structured by subject areas to make it easier for users to get even complex summarized information with a relatively simple query and to perform multi-dimensional analysis. This has two powerful benefits: multi-level trend analysis and end-user empowerment.

Multi-level trend analysis provides the ability to analyze key trends at every level across several different dimensions (e.g., Organization, Product, Location, Channel and Time) and the hierarchies within them. Most reporting, data analysis and visualization tools take advantage of the underlying data model to provide powerful capabilities such as drill-down, roll-up, drill-across and various ways of slicing and dicing data.

The flattened data model makes it much easier for users to understand the data and write queries, rather than work with potentially several hundred tables and write long queries with complex table joins and clauses.

Knowledge Discovery and Decision Support:
Knowledge discovery and data mining (KDD) is the automatic extraction of non-obvious, hidden knowledge from large volumes of data. For example, classification models could be used to classify members into low, medium and high lifetime value.
Instead of coming up with a one-size-fits-all product, the membership can be divided into different clusters based on member profile using clustering models, and products could be customized for each cluster. Affinity groupings could be used to identify better product bundling strategies.

These KDD applications use various statistical and data mining techniques and rely on subject-oriented, summarized, cleansed and "de-noised" data, which a well-designed data warehouse can readily provide.

The data warehouse also enables an Executive Information System (EIS). Executives typically cannot be expected to sift through several different reports trying to get a holistic picture of the organization's performance and make decisions. They need the KPIs delivered to them.

Some of these KPIs may require cross-product or cross-departmental analysis, which may be too manually intensive, if not impossible, to perform on raw data from operational systems. This is especially relevant to relationship marketing and profitability analysis. The data in the data warehouse is already prepared and structured to support this kind of analysis.

Performance:
Finally, the performance of transactional systems and query response time make the case for a data warehouse. Transactional systems are meant to do just that, perform transactions efficiently, and hence are designed to optimize frequent database reads and writes. The data warehouse, on the other hand, is designed to optimize frequent complex querying and analysis. Some of the ad-hoc queries and interactive analysis that can be performed in a few seconds to minutes on a data warehouse could take a heavy toll on the transactional systems and drag their performance down.

Holding historical data in transactional systems for a longer period of time could also interfere with their performance. Hence, the historical data needs to find its place in the data warehouse.
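To make the flattened (denormalized) data model and the roll-up, drill-down and slicing operations from the Advanced Reporting & Analysis part more concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration: the table name member_fact, the column names, the helper functions and the sample rows are all made up for this example and do not come from the CSVTU notes.

```python
import sqlite3

# Made-up sample rows for illustration only (not from the notes).
# Each row already carries its dimension values, so no joins are needed.
SAMPLE_ROWS = [
    # (product, location, channel, year, amount)
    ("Loan", "North", "Branch", 2014, 100.0),
    ("Loan", "North", "Online", 2014, 50.0),
    ("Loan", "South", "Branch", 2015, 70.0),
    ("Savings", "North", "Branch", 2015, 30.0),
    ("Savings", "South", "Online", 2015, 20.0),
]

DIMENSIONS = ("product", "location", "channel", "year")


def build_warehouse():
    """Create an in-memory, denormalized fact table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE member_fact ("
        "product TEXT, location TEXT, channel TEXT, year INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO member_fact VALUES (?, ?, ?, ?, ?)", SAMPLE_ROWS)
    conn.commit()
    return conn


def summarize(conn, dims):
    """Total amount grouped by the chosen dimensions.

    Fewer dimensions gives a roll-up; more dimensions gives a drill-down.
    """
    if not dims or any(d not in DIMENSIONS for d in dims):
        raise ValueError("dims must be a non-empty subset of DIMENSIONS")
    cols = ", ".join(dims)  # safe: names were checked against DIMENSIONS
    sql = (
        f"SELECT {cols}, SUM(amount) FROM member_fact "
        f"GROUP BY {cols} ORDER BY {cols}"
    )
    return conn.execute(sql).fetchall()


def slice_by(conn, dim, value, dims):
    """Fix one dimension to a single value (a slice), then summarize."""
    if dim not in DIMENSIONS:
        raise ValueError("unknown dimension: " + dim)
    if not dims or any(d not in DIMENSIONS for d in dims):
        raise ValueError("dims must be a non-empty subset of DIMENSIONS")
    cols = ", ".join(dims)
    sql = (
        f"SELECT {cols}, SUM(amount) FROM member_fact "
        f"WHERE {dim} = ? GROUP BY {cols} ORDER BY {cols}"
    )
    return conn.execute(sql, (value,)).fetchall()


if __name__ == "__main__":
    conn = build_warehouse()
    print(summarize(conn, ["product"]))                     # roll-up
    print(summarize(conn, ["product", "year"]))             # drill-down
    print(slice_by(conn, "location", "North", ["product"])) # slice
```

With the sample rows above, the roll-up by product gives Loan 220.0 and Savings 50.0, and drilling down by year splits Loan into 150.0 for 2014 and 70.0 for 2015. Every question is a single GROUP BY over one table, which is the "relatively simple query" benefit the notes describe. A real warehouse would normally use a star schema with separate dimension tables, so treat this single-table version as a simplification.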
Trends in Data Warehousing:
The industry's experience with data warehousing over the last decade has provided important lessons on what works in today's business intelligence (BI) solutions. It is not only these lessons but also the emerging trends that are shaping the industry's direction in business solutions. As a result, the reference architectures used in building enterprise data warehouse solutions are changing to meet business demands. This evolving reference architecture will be overviewed, followed by the implications of these changes.

It is these evolving reference architectures that are putting new demands on the databases used in warehousing. An important point is that although many of these concepts are not new, databases are being pushed in new ways that require further technology invention. With the emergence and evolution of the intranet, and with more businesses exploiting semi-structured data, the more traditional business models are evolving with respect to data accessibility, delivery and concurrency. Technologies such as XML and web services become more critical as databases integrate with web portals and BI tooling.

Moreover, additional demands for broader decision making within enterprises are causing heavy consolidation and nontraditional mixed workloads (heavily mixing OLTP and DSS) beyond what has been conventional in the past. Service level agreements, as well as normal operational characteristics (e.g., backups), are not the same. In many cases, however, consolidation is not an option or is not desired. In such cases, the business question still needs to be answered. As a result, federation augmentation is also very real in enterprise systems. Query management in a federated environment is still a challenging task. A combination of consolidation and federation augmentation is being seen.
In addition to heavy consolidation and federation augmentation, both real-time (right-time) and active data warehousing systems

For detailed notes, here is the attachment: