What & Why?
Data
- Data analytics is the process of examining datasets to find trends, answer questions, and draw insights that drive business decisions.
- It’s valuable because modern businesses generate huge amounts of data. Structuring and analyzing that data efficiently can be a major competitive advantage.
Ingesting Data
- Start by ingesting data, i.e. getting data into the cloud
- There are different data sources you can ingest data from:
  - Applications
  - Crawlers & scheduled tasks
  - Devices & sensors
    - temperature, movement speeds, etc.
    - e.g. an automobile company
    - high-frequency, streaming data
    - AWS Kinesis
  - Manual data entry
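A minimal sketch of the sensor case above, assuming a boto3 Kinesis client is passed in. The field names (`device_id`, `temperature_c`, `speed_kmh`) and the helper functions are made up for illustration; only `put_record` with `StreamName`, `Data`, and `PartitionKey` is the actual boto3 call.

```python
import json
import time


def build_sensor_record(device_id, temperature_c, speed_kmh):
    """Serialize one hypothetical sensor reading as a Kinesis-style record."""
    payload = {
        "device_id": device_id,
        "temperature_c": temperature_c,
        "speed_kmh": speed_kmh,
        "timestamp": time.time(),
    }
    return {
        # Kinesis expects the payload as bytes.
        "Data": json.dumps(payload).encode("utf-8"),
        # Records with the same partition key land on the same shard.
        "PartitionKey": device_id,
    }


def send_reading(client, stream_name, record):
    """Send one record through a boto3 Kinesis client's put_record call."""
    return client.put_record(StreamName=stream_name, **record)
```

With real credentials this would be used roughly as `send_reading(boto3.client("kinesis"), "vehicle-telemetry", build_sensor_record("car-1", 21.5, 88.0))`, where the stream name is a placeholder.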
Ingestion Frequency
- All these sources produce data at different frequencies, and the frequency matters
- low-frequency data
  - manual data entry
  - crawling (e.g. every hour)
- moderate-frequency data
  - user orders
  - still not overwhelming
- high-frequency data
  - e.g. website logs, sensor data
  - more difficult to process & store because it can easily overwhelm your systems
  - too much data arriving at the same time could bring down your servers
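One common way to keep high-frequency data from overwhelming downstream systems is to buffer records and forward them in batches. The sketch below is a plain-Python illustration of that idea, not an AWS API: `BatchBuffer` and its `flush` callback are hypothetical names, and the default of 500 is chosen only because Kinesis `PutRecords` accepts up to 500 records per request.

```python
class BatchBuffer:
    """Collect incoming records and hand them downstream in fixed-size batches."""

    def __init__(self, flush, max_batch=500):
        self.flush = flush          # callable that receives a list of records
        self.max_batch = max_batch  # flush automatically once this many are pending
        self.pending = []

    def add(self, record):
        """Queue one record; flush when the batch is full."""
        self.pending.append(record)
        if len(self.pending) >= self.max_batch:
            self.drain()

    def drain(self):
        """Flush whatever is pending (call this on shutdown or on a timer)."""
        if self.pending:
            batch, self.pending = self.pending, []
            self.flush(batch)


batches = []
buf = BatchBuffer(batches.append, max_batch=3)
for i in range(7):
    buf.add({"line": i})
buf.drain()
print([len(b) for b in batches])  # [3, 3, 1]
```

Downstream code then sees three calls instead of seven, and a real `flush` could write each batch to a stream or to storage.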
Services
- High-frequency data ingestion
- Storing data meant to be analyzed
  - AWS data lakes & warehouses
  - Redshift - storing & analyzing data
- Data processing & transformation
  - Glue - transforming & loading data
  - EMR (Elastic MapReduce) - self-managed big data computation (alternative to Glue)
- Data querying & analysis
- Searching / visualization
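To make the "processing & transformation" and "query & analysis" stages concrete, here is a small plain-Python stand-in. It does not use Glue, EMR, or Redshift; the log format (`timestamp method path status`) and all function names are assumptions made up for the example.

```python
from collections import Counter


def parse_log_line(line):
    """Split a hypothetical 'timestamp method path status' log line into a dict."""
    timestamp, method, path, status = line.split()
    return {
        "timestamp": timestamp,
        "method": method,
        "path": path,
        "status": int(status),
    }


def transform(lines):
    """Transformation step: parse raw lines and skip malformed ones."""
    rows = []
    for line in lines:
        try:
            rows.append(parse_log_line(line))
        except ValueError:
            # Wrong number of fields or a non-numeric status code.
            continue
    return rows


def requests_per_path(rows):
    """Analysis step: count successful (2xx) requests per path."""
    return Counter(r["path"] for r in rows if 200 <= r["status"] < 300)


raw = [
    "2024-01-01T00:00:00Z GET /home 200",
    "2024-01-01T00:00:01Z GET /cart 500",
    "garbage",
    "2024-01-01T00:00:02Z GET /home 200",
]
print(requests_per_path(transform(raw)))  # Counter({'/home': 2})
```

In a real pipeline the transform would run as a managed job and the count would be a query against the warehouse; the shape of the work (clean, structure, then aggregate) is the same.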