Wanted: open data suitable for a data science project!
Every year, we ask our second-year bachelor students to do a project. We give them a (big) dataset (preferably in JSON or as a bunch of files to combine), pose a research question, and ask them to perform data cleaning and exploration related to that question using #Rstats.
I've used most of the obvious choices, so I'm turning to Fedi to find new, interesting datasets. If you have no ideas, sharing helps too!
Use #GBIF species occurrence point data. Also spatial, and combinable: https://geodaten.bayern.de
We also have a wealth of training resources and structured learning modules that may be helpful for your task. GBIF provides access to more than 3.6 billion open-access occurrence records, so if you're after big data, you've come to the right place!
https://www.gbif.org/training