Why are the original/raw data not readily usable by analytics tasks? What are the main data preprocessing steps? List and explain their importance in analytics.
Subject: Computer Science
Answer:
1. Why are the original/raw data not readily usable by analytics tasks?
The main reason original/raw data is not readily usable for analytics is that it is usually dirty, misaligned, inaccurate, and overly complex. Data preprocessing and cleansing are necessary so that data mining models are fed clean data.
The main challenges of using raw data in analytics tasks are:
· Data is never static - because data keeps changing, it should regularly undergo cleansing to remove duplicates and to structure it properly for use in data mining processes.
· Incorrect data - analyzing incorrect data can result in bad strategic decisions.
· Development and utilization of a data cleansing framework - to ensure that the right data is used at the right time and to maximize the value of the data being analyzed, using a data cleansing framework is recommended.
· Big Data can bring Big Problems - unstructured and uncleaned data can harm the organization instead of benefiting it.
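To make the challenges above concrete, here is a minimal Python sketch of a cleansing pass over a few raw records. The records, the field names (id, name, age), the clean_records helper, and the 0 to 120 age range are all hypothetical and chosen only for illustration; a real cleansing framework would apply rules specific to its own data.

```python
# Hypothetical raw customer records, invented for illustration only.
raw_records = [
    {"id": "1", "name": " Alice ", "age": "34"},
    {"id": "2", "name": "BOB", "age": "n/a"},    # age cannot be parsed
    {"id": "1", "name": "alice", "age": "34"},   # duplicate of id 1
    {"id": "3", "name": "Carol", "age": "-5"},   # impossible value
]


def clean_records(records):
    """Return deduplicated records with normalized names and validated ages."""
    seen_ids = set()
    cleaned = []
    for rec in records:
        rec_id = int(rec["id"])
        if rec_id in seen_ids:
            continue  # drop duplicates, keeping the first occurrence
        seen_ids.add(rec_id)

        # Normalize inconsistent spacing and capitalization.
        name = rec["name"].strip().title()

        # Treat unparseable or out-of-range ages as missing (None).
        try:
            age = int(rec["age"])
        except ValueError:
            age = None
        if age is not None and not (0 <= age <= 120):  # assumed valid range
            age = None

        cleaned.append({"id": rec_id, "name": name, "age": age})
    return cleaned


cleaned = clean_records(raw_records)
for row in cleaned:
    print(row)
```

In this sketch the duplicate record is dropped, names are normalized, and the two invalid ages are marked as missing rather than passed on to the analysis, where they could distort results.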
2. What are the main data preprocessing steps? List and explain their importance in analytics.
The main data preprocessing steps are outlined below: