Working with Open Data
Course Outline
Last revised: 2013.05.27
- Day 1: Introduction to Working with Open Data
- Day 2: Review of Python and IPython (Chap 3 and Chap 13)
- Day 3: Getting started with pandas and NumPy I
- Day 4: Getting started with pandas and NumPy II
- Day 5: Pandas Cont'd / Introduction to Matplotlib / Projects
- Day 6: Project Brainstorming and Team Formation
- Day 7: Preparing for Courtlistener / Branching out to Wikipedia and Freebase / Projects
- Day 8: Courtlistener
- Day 9: Reflection: Courtlistener, Homework, Projects, Knight Challenge
- Day 10: Pip, Fixed Column Data, Freebase
- Day 11: Project Proposals and Pandas!
- Day 12: UC Data Lab and Data for our projects
- Day 13: Project Proposals and Continuing to Learn PfDA
- Day 14: Time, Space, and Baby Names
- Day 15: Midterm Preparation / Cloud Computing Teaser / Study Hall
- Day 16: PiCloud / AWS / CommonCrawl
- Day 17: Midterm Exam
- Day 18: CommonCrawl
- Day 19: PiCloud, CommonCrawl revisited, Projects
- Day 20: Guest Speaker: Fernando Perez
- Day 21: The last stretch
- Day 22: Finishing up PiCloud work and preview of Juriscraper workshop
- Day 23: Projects
- Day 24: Projects II
- Day 25: Projects III
- Day 26: Guest Speaker: Eric Kansa on Open Context
- Day 27: Project Presentations I
- Day 28: Project Presentations II
- Day 29: Open Data Project Exhibit
Old plans:
- Review of Python and IPython (Chap 3 and Chap 13) + set up EPD
- NumPy and Getting started with pandas I (Chap 4, 5)
- Open Government Data I: US Census data: introduction to working with it
- Plotting and Visualization (Chap 8)
- Data Loading, Storage, and File Formats (Chap 6)
- Wikipedia: API, data structure
- NumPy and Getting started with pandas II (Chap 4, 5)
- Freebase, DBpedia, Wikidata
- Data Wrangling: Clean, Transform, Merge, Reshape (Chap 7)
- Working with geodata I
- JavaScript-based visualization I
- Publishing open data including LOD
- Google Refine
- Working with geodata II
- JavaScript-based visualization II
- MIDTERM (Day 17, 2013-03-19)
- ????
- Wikipedia II
- Data Aggregation and Group Operations (Chap 9)
- Time Series (Chap 10)
- Financial and Economic Data Applications (Chap 11)
- ????
- Open Government Data II
- Advanced NumPy (Chap 12)
- PyData Misc I / Working on Projects
- PyData Misc II / Working on Projects
- Project Poster Session
To be scheduled:
- open scientific data (w/ guest lectures)
- using licensed UC Data Lab resources
- 1 to 3 more outside speakers
Can we fit other topics?
- open music data, open bibliographic data (e.g., the Harvard dataset, OCLC, HathiTrust), etc.