For all group collaboration, we’ll be using the dotData Slack Channel. All you need to sign up is an address.

Group Projects
(You’ll need to join the Slack Channel to get involved with these!)
Project: Social Network-based Music Predictions—#music-social-network on Slack
Build a music recommender system that uses social information. Essentially, the idea is to build a social network for music discovery, where people can find music they like based on their friends’ circles.
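If you’re wondering what "recommendations based on friends’ circles" could look like in code, here is a minimal sketch. It is not the project’s actual design: the function name, the data layout (plain dictionaries of sets), and the toy users and songs are all made up for illustration. It simply counts how many of your friends listen to each song you haven’t heard yet and ranks songs by that count.

```python
from collections import Counter


def recommend_from_friends(user, friends, listens, top_n=3):
    """Rank songs that the user's friends listen to and the user has not heard.

    friends: dict mapping a user to the set of their friends (hypothetical layout)
    listens: dict mapping a user to the set of songs they listen to (hypothetical layout)
    """
    already_heard = listens.get(user, set())
    counts = Counter()
    for friend in friends.get(user, set()):
        for song in listens.get(friend, set()):
            if song not in already_heard:
                counts[song] += 1
    # Most friends first; ties are broken alphabetically so the order is stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [song for song, _ in ranked[:top_n]]


# Toy data, invented for this example only.
friends = {"ana": {"ben", "cy"}}
listens = {
    "ana": {"Song A"},
    "ben": {"Song A", "Song B"},
    "cy": {"Song B", "Song C"},
}

print(recommend_from_friends("ana", friends, listens))  # ['Song B', 'Song C']
```

A real system would likely weight closer friends more heavily and take listening frequency into account, but counting friends is an easy baseline to start from.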
Project: Mining Twitter for Sentiments toward Vaccination—#twitter-data-project on Slack
A PhD student in the club is working on a project! Collect Twitter data and find out follower relationships based on tweets about vaccination. Once a data scraper is assembled, it will stream data continuously for 3 months! She has a scraping script but needs help with both cleaning the data and keeping the script running for the full 3-month period.
Last Meeting (11/08/2018):
Last night we had Dr. Finn Kuusisto give a talk on Machine Learning in Python, complete with code examples and the biological applications he uses in his work. If you’d like to see the slides, you can find them here. If you’d like to ask him any questions about his work or about his career in general, he welcomes questions! Email him at
Incoming Data Science Major & Potential Next Semester Courses:
I’d like to thank you all for your attendance over the semester. With class selection coming up, I think an announcement might be useful for underclassmen:
Starting in Fall 2020, UW-Madison will officially offer a Data Science major. The curriculum for this major is currently under development, and a slew of new courses will be created for it! If you’re not an underclassman, it is sadly too late for you to look into it. That said, here are some recommendations for courses to look into:
• CS 301: Data Programming [Python], 3 credits — this is the only course that I know for sure will be part of the major. It’s part 1 of 2 in their programming sequence.
• CS 368: Topics in Computing [R, MATLAB, C++], 1 credit — this offering changes per semester, but these languages are invaluable. R is used for most stats courses, our databases course uses C++, and many engineering courses use MATLAB.
• CS 532: Matrix Methods in Machine Learning, 3 credits — NOTE: this is unlikely to be part of the Data Science major, but it’s a very good applied course. I was briefly in it before switching into a different course, and it seemed interesting. It’s a flipped classroom, so you watch some videos outside of class and then work on homework in class. It may or may not be reserved for graduate students next spring? Unsure, but that would be odd.
– Pre-Reqs: MATH 222 and (E C E 203, COMP SCI 200, 300 or 302) or graduate/professional standing
• Math 234: Calc III, 4 credits — easier than Calc II, and a pre-requisite for foundational courses like the Stats 309/310 sequence (mathematical statistics and probability)
– Pre-Reqs: MATH 222
• Math 340: Linear Algebra, 3 credits — lovely course. Avoid its sibling course, Math 320, if you don’t like differential equations.
– Pre-Reqs: MATH 222
• Stats 301 or 302: Intro to Statistics, 3 credits — very likely to be a pre-requisite for other required Stats courses
– Pre-Reqs: None for STAT 301; MATH 221 in order to take STAT 302
• Stats 327: R, 1 credit — This will just be a useful language, whether for research or other classes. They have intro, intermediate, advanced, all in one semester. If you’re more CS-focused, you could jump for CS 368 – R (caveat, you’ll need to get a credit substitution for 327)
• Stats 333: Applied Regression Analysis, 3 credits — honestly wish I could have taken this. Only open to stats majors, it seems.
– Pre-Reqs: (STAT 224, STAT 301, STAT 302, STAT 312, STAT 324, or STAT 371) and STAT 327 or concurrent enrollment
It’s been a good semester, guys. Thanks for showing up!
If you have any other questions about scheduling or courses to take, please feel free to email me at, or our VP and project coordinator Adi at

SAS and Genetic Algorithms Resources (10/25/2018)

Genetic Algorithms:

If you have more questions about Genetic Algorithms, email Matthew at:


Some reminders from Rachel (from SAS):

Registration for the Student Symposium is November 16th. Teams of 2-4 with a faculty advisor will compete in a data challenge, with a data set and SAS software to use. The top 3 teams will be highlighted at the Global Forum in April 2019.

SAS E-Learning – Access code: G70007601
Free University Edition software
Free Book on Data Science

Apply to Internships on Handshake or here

If you have any questions for Rachel regarding SAS, contact her at:

Previous Meeting Notes (10/11/2018)

Thanks to everyone who came! We had a few members speak about their experience with research, how networking played a role, and how they initially reached out to the Principal Investigators. For tips on your résumé + getting into research (with sample emails), check out the slides. Also see the attached PDF that lays out what to avoid on your résumé and during elevator pitches.