Data wrangling
Resources
Audit your data. For guidance, see Building Machine Learning Powered Applications, PDF pages 33-35. It describes levels of data availability, "from best-case scenario to most challenging": "Labeled data exists", "Weakly labeled data exists", "Unlabeled data exists", "We need to acquire data." What level are you at now? What level can you reach?
For more guidance, read the Data Collection + Evaluation chapter in Google's People + AI Guidebook.
Safe Handling Instructions for Missing Data is a thought-provoking 30-minute video. What will you do about missing data?
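To make the question concrete, here is a minimal pandas sketch of two common responses to missing data: dropping incomplete rows, or imputing a value while keeping a flag that records which rows were imputed. It is not taken from the video. The toy data, the column names, and the choice of median imputation are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are made up for illustration.
df = pd.DataFrame({
    "age": [34, np.nan, 29, 41],
    "income": [52000, 61000, np.nan, np.nan],
    "city": ["Austin", "Boise", None, "Denver"],
})

# Step 1: measure the problem. Fraction of missing values per column.
missing_share = df.isna().mean()
print(missing_share)

# Option A: drop every row that has at least one missing value.
# Simple, but here it discards three of the four rows.
dropped = df.dropna()

# Option B (an assumption, not a recommendation): fill income with the
# median and keep an indicator column so the imputation stays visible.
imputed = df.assign(
    income_was_missing=df["income"].isna(),
    income=df["income"].fillna(df["income"].median()),
)
print(imputed)
```

Comparing `len(dropped)` with `len(df)` before committing to Option A shows how much data the simple approach costs. The indicator column in Option B lets a downstream model or analyst tell observed values from filled ones.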
This pandas pipe video series shows how to transition from "notebook-style" pandas code to clean, production code you'll be proud of! After you write exploratory code, can you clean it up?
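As a rough illustration of the idea, the sketch below turns a few exploratory steps into small named functions chained with `DataFrame.pipe`. It is not the code from the video series. The function names (`normalize_columns`, `drop_missing_ids`, `add_total`) and the toy order data are hypothetical.

```python
import pandas as pd


def normalize_columns(df):
    """Return a copy with snake_case column names."""
    return df.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))


def drop_missing_ids(df, id_col):
    """Drop rows where the identifier column is missing."""
    return df.dropna(subset=[id_col])


def add_total(df):
    """Add a total column computed from unit price and quantity."""
    return df.assign(total=df["unit_price"] * df["quantity"])


# Hypothetical raw data with messy column names.
raw = pd.DataFrame({
    " Order ID": [1.0, None, 3.0],
    "Unit Price": [2.5, 4.0, 10.0],
    "Quantity ": [4, 1, 2],
})

# Each step takes a DataFrame and returns a new one, so the chain reads
# top to bottom and `raw` itself is left unchanged.
clean = (
    raw
    .pipe(normalize_columns)
    .pipe(drop_missing_ids, id_col="order_id")
    .pipe(add_total)
)
print(clean)
```

Because each step is a plain function, it can be unit-tested on its own and reused across notebooks, which is harder to do with a long sequence of in-place cell edits.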
Data Cleaning IS Analysis, Not Grunt Work is a great place to start. What are you learning about your data while you wrangle it? See this humorous, depressing example of why data cleaning is necessary and time-consuming:
