Resources

For all group collaboration, we’ll be using the dotData Slack Channel. All you need to sign up is an @wisc.edu address.

Internships

Research Internships — National Science Foundation
Do paid research over the summer! They have computation-focused opportunities and more!


SAS Resources:

Rachel from SAS: rkwon@wisc.edu
– Videos from presentation: SAS Inside Out, SAS Building Video, Fundraiser Video
– SAS E-Learning (access code: G70007601)
– Free University Edition software: www.sas.com/universityedition
– Video tutorials: https://video.sas.com/category/videos/sas-studio
– Free Book on Data Science
– Apply to Internships on Handshake or here

Projects/Personal

Group Projects
(You’ll need to join the Slack Channel to get involved with these!)

Project: Social Network-based Music Predictions—#music-social-network on Slack – Build a music recommender system that uses social information. Essentially, the goal is to build a social network for music discovery, where people can discover music they like based on their friends' circles.
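To make the idea concrete, here is a minimal sketch of a friend-based recommendation, and not the project's actual code. It assumes two hypothetical dictionaries, `friends` (who follows whom) and `likes` (which songs each person likes), and it simply counts how many of your friends like a song you haven't liked yet. The function name `recommend_from_friends` and the sample data are made up for illustration.

```python
from collections import Counter


def recommend_from_friends(user, friends, likes, top_n=3):
    """Rank songs liked by a user's friends that the user has not liked yet."""
    already_liked = likes.get(user, set())
    votes = Counter()
    for friend in friends.get(user, set()):
        for song in likes.get(friend, set()):
            if song not in already_liked:
                votes[song] += 1  # one vote per friend who likes the song
    # Sort by vote count (descending), then by title for a stable order.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [song for song, _ in ranked[:top_n]]


# Hypothetical sample data: a tiny friend graph and each person's liked songs.
friends = {"ana": {"ben", "cy"}, "ben": {"ana"}, "cy": {"ana"}}
likes = {
    "ana": {"Song A"},
    "ben": {"Song A", "Song B", "Song C"},
    "cy": {"Song B", "Song D"},
}

print(recommend_from_friends("ana", friends, likes))
# ['Song B', 'Song C', 'Song D']
```

A real system would likely weight friends differently or mix in listening history, but counting friend votes is a reasonable starting point to experiment with.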

Project: Mining Twitter for Sentiments toward Vaccination—#twitter-data-project on Slack – A PhD student in the club is working on a project! Collect Twitter data and find follower relationships based on tweets about vaccination. Once a data scraper is assembled, it will stream data continuously for 3 months! She has a scraping script but needs assistance with both cleaning the data and keeping the script running for the full 3-month period.

Genetic Algorithms:
– Genetic Algorithms Prezi
– Click here for a Genetic Algorithms Tutorial
– Tetris Example: code and live example (type "tetris" on the page once it loads)
– If you have more questions about Genetic Algorithms, email Matthew at matthew.wolff@wisc.edu.

ML in Python:
We had Dr. Finn Kuusisto give a talk on Machine Learning in Python, complete with code examples and the biological applications he uses in his work. Look over his slides here. If you’d like to ask him any questions about his work or about his career in general, he welcomes questions! Email him at fkuusisto@morgridge.org.

Networking and Résumé:
We had a few members speak about their experience with research, how networking played a role, and how they initially reached out to Principal Investigators. For tips on your résumé and on getting into research (with sample emails), check out the slides.

Academics

Starting in Fall 2020, UW-Madison will officially offer a Data Science major. The curriculum for this major is currently under development, and a slew of new courses will be created for it! If you're not an underclassman, this is sadly too late for you. That said, here are some recommendations for courses to look into:

CS 301: Data Programming [Python], 3 credits — this is the only course that I know for sure will be part of the major. It's part 1 of 2 in their programming sequence.

CS 368: Topics in Computing [R, MATLAB, C++], 1 credit — this offering changes per semester, but these languages are invaluable. R is used for most stats courses, our databases course uses C++, and many engineering courses use MATLAB.

CS 532: Matrix Methods in Machine Learning, 3 credits — NOTE: unlikely this will be part of the Data Science major, but it’s a very good applied course. I was briefly in it before switching into a different course. Seemed interesting. It’s a flipped classroom, so you watch some videos outside, and then work on homework in-class. May or may not be reserved for graduate students next spring? Unsure, but that would be odd.
– Pre-Requisites: MATH 222 and (E C E 203, COMP SCI 200, 300 or 302) or graduate/professional standing

Math 234: Calculus III, 4 credits — easier than Calculus 2, and a pre-requisite for foundational courses like the Stats 309/310 sequence (mathematical statistics and probability)
– Pre-Requisites: MATH 222

Math 340: Linear Algebra, 3 credits — lovely course. Avoid its sibling course, Math 320, if you don’t like differential equations.
– Pre-Requisites: MATH 222

Stats 301 or 302: Intro to Statistics, 3 credits — very likely to be a pre-requisite for other required Stats courses
– Pre-Requisites: None for STAT 301; MATH 221 in order to take STAT 302

Stats 327: R, 1 credit — R is simply a useful language to know, whether for research or other classes. The course comes in intro, intermediate, and advanced levels, all in one semester. If you're more CS-focused, you could go for the R offering of CS 368 instead (caveat: you'll need to get a credit substitution for 327)

Stats 333: Applied Regression Analysis, 3 credits — honestly wish I could have taken this. Only open to stats majors, it seems.
– Pre-Requisites: (STAT 224, STAT 301, STAT 302, STAT 312, STAT 324, or STAT 371) and STAT 327 or concurrent enrollment