Welcome to MATH 615

Data Analysis for Graduate Research

Robin Donatello

2023-08-21

Meet your instructor

Dr. Robin Donatello (she/her)

You can address me as “Robin”, “Dr. D”, or some other respectful title.

I have a Doctorate in Public Health (DrPH) in Biostatistics from UCLA, but I’m a Chico alum. I double majored in Statistics & Biology with a minor in Chemistry, and I am a first-generation college student who started at Butte College.

My campus life consists of training the next generation of scientists to harness the power of Statistics and Data in a responsible and ethical manner, leading the Data Science Initiative (DSI) to provide training and experiences for students and faculty, and providing analytical support and statistical consulting for many projects on and off campus.

When I’m not on campus, I’m typically growing food for my family, out adventuring with my dogs, or getting some game time in. You can learn more about the projects I’m involved in on my website.

Your turn

  • Turn to your neighbor and introduce yourself.
  • You’ll post an introduction in Discord this week so everyone will get to know everyone else.

Code of Conduct

Everyone is welcome here

It is my intent that students from all diverse backgrounds and perspectives be well-served by this course, that students’ learning needs be addressed both in and out of class, and that the diversity that the students bring to this class be viewed as a resource, strength and benefit.

It is my intent to present materials and activities that are respectful of diversity: gender identity, sexuality, disability, age, socioeconomic status, ethnicity, race, nationality, religion, and culture.

Supportive Learning Environment

I would like to create a learning environment that supports a diversity of thoughts, perspectives and experiences, and honors your identities (including race, gender, class, sexuality, religion, ability, etc.).

To help accomplish this:

  • Let me know if you have a name and/or set of pronouns that differ from those that appear in your official Chico records.
  • Help me pronounce your name as accurately as possible. Corrections are welcome.
  • If you feel like your performance in the class is being impacted by your experiences outside of class, please don’t hesitate to come and talk with me. I want to be a resource for you.

What is this class about?

Learning goals

  • Developing the skills to conduct statistically valid and reproducible research.
  • Understanding how data needs to be structured and formatted for analysis, so you can better prepare data collection methods for future research.
  • Practicing the skills to be the boss of your own data without relying on others to “run the numbers” for you.
  • Learning basic statistical techniques for a small selection of analysis situations.
  • Learning how to do all of this in a reproducible manner to save you headache and time during your research.
  • Laying the statistical foundation so you can learn to apply more advanced statistical models as needed, such as those covered in Applied Statistics II (Math 456).

See the syllabus for more detailed learning objectives.

How are we going to accomplish all that?

Lots of course materials and tools

  • I don’t sanitize this class for you by keeping everything in a learning management system like Canvas.

  • In a working environment you have to deal with multiple platforms and accounts, and manage multiple locations for files and content. I use the best tools for the job.

  • Homework 0 provides a checklist for getting connected and testing out your tools.

Class website

https://math615.netlify.app/

  • Bookmark this page. You will be here a lot.
  • Contains details on weekly topics, links to notes, assignments and additional materials.
  • Sometimes links will be broken. Typos happen. Notify me via Discord and I’ll get to it ASAP.

Textbook

  • The textbook is used for reading and learning content at a more in-depth level than my overview slides.
  • It is also used in Math 456, and is an excellent reference guide.
  • If you’re curious, I get about 1% of each sale in royalties. The income is not why I collaborated on this edition. I used the 4th edition in grad school and really like teaching out of it.

Canvas

  • Non-project assignment submission
  • Grades
  • Schedule & Due dates

Google Drive

  • Quizzes (individual and group)
  • Collaborative feedback on your research project
  • You have been added to the Math 615 Google Drive using your campus email (@mail.csuchico.edu).

Discord

  • Used for outside-of-class discussions, meme sharing, homework help and general chatter.
  • I will not answer most class-content based questions through email.
  • Download either the phone app or the desktop app (I use both). This is mandatory.
    • Do not rely on remembering to log in via the web browser. You will miss important notifications.

Lecture notes (slides)

  • Standalone lecture notes like these.
  • Click on the three lines in the bottom right to see options for exporting a copy.
  • Collaborative lecture notes on HackMD (more info below).

Video

  • Sometimes I will record lectures and want you to watch them before class.
  • This frees up in-class time for actively working on homework or your project with your peers.
  • These will be posted in Canvas as necessary.

ChatGPT

  • I recognize there are a variety of AI programs available to assist in creating text and writing code.
  • These programs are not a replacement for human creativity, originality, and critical thinking.
  • Writing and coding are skills that you must nurture over time in order to develop your own individual style and understanding.
  • The use of ChatGPT is allowed and encouraged to help you learn how to code, but all code used must be fully explained in text. (Week 3)
  • The syllabus has more details on the approved use of language models to write text.

HackMD

  • Interactive collaborative notes
  • No account needed to view, but account needed for editing (which you will be doing). I recommend using your campus email.

Time Commitment

  • Expect to spend 10-12 hours each week on this class.
    • Note that this is 1-3 hours more than an undergraduate 3-unit course would take (9 hrs). This is to acknowledge that it may take you longer than expected to complete the work until you gain practice coding.
  • I will do my best not to move due dates so you can plan accordingly.

Research Project

Project Overview

  • Semester-long project
  • Individual projects with peer review collaboration
  • All assignments are designed to support your research.
  • You must choose a project from a select set of data sets.
    • Individual research data is typically not developed or robust enough to be demonstrative.
  • Project will culminate with the creation and presentation of a research poster.
  • More details are on the project page.

Data Analysis Software

Welcome to modern Statistics & Data Analysis

  • No more TI-83; modern statistics is computationally based.
    • And I don’t mean Excel.
  • Big push for open research in the Natural and Social Sciences.
    • Sharing code & data is sometimes required along with the manuscript for publication.
  • Reproducibility. Give someone else access to your data and code, and they can replicate your findings.
    • We will practice this every day in this class.
  • I’m not the only one teaching reproducible research.
  • Expect to bring your laptop every day to class.
    • The more reading and content engagement done outside of class, the more time for in-class analysis and discussion.

Software program of choice (SPC)

  • This class is not a class on how to use the software program.
  • You will be responsible for learning a lot of the programming language outside of class on your own time.
    • Part of an MS program is learning how to learn.
  • The help page on the website has a section on programming resources.
  • All my lecture notes use R with some examples in SPSS.
  • I will not dictate which software program you use in this class.
  • I will expect you to submit reproducible code.
    • You can point and click your way to an answer, but code must be saved and reusable with minimal changes.
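As a rough illustration of what "saved and reusable with minimal changes" can look like, here is a minimal sketch in Python. Any of the software programs discussed in this class would work equally well. The data values, column names, and the function name `summarize_by_group` are all hypothetical, made up for this example. The point is that the whole analysis lives in a script that can be rerun from top to bottom.

```python
import csv
import io
import statistics

# Hypothetical example data, embedded so the script runs on its own.
# In a real project this would be a data file whose path is set once, here.
RAW_DATA = """id,group,score
1,control,72
2,control,68
3,treatment,80
4,treatment,85
"""


def summarize_by_group(csv_text, group_col, value_col):
    """Return {group: (n, mean)} computed from CSV text."""
    values = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        values.setdefault(row[group_col], []).append(float(row[value_col]))
    return {g: (len(v), statistics.mean(v)) for g, v in values.items()}


# Rerunning this script on updated data only requires changing RAW_DATA
# (or the column names passed below), not redoing any manual steps.
summary = summarize_by_group(RAW_DATA, "group", "score")
for group, (n, mean) in sorted(summary.items()):
    print(f"{group}: n={n}, mean={mean:.1f}")
```

Because every step is written down, someone else can run the same file on the same data and get the same summary, which is what the point-and-click route cannot guarantee unless the generated code is saved.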

Software cont.

  • Be open to new things; there is power in being a polyglot. You can use one language in here and another language for another class or project.
  • Your professors in your other classes, or your master’s committee, may want you to learn a specific language. So might your industry.
  • Don’t make assumptions; look at job postings and see what they want.
    • E.g. Center for Healthy Communities has a lot of Nutrition faculty/students who typically use SPSS, but the Research & Evaluation Team is solidly using R.
  • Political Science & Economics often use Stata. I’ve met Sociologists that use SAS, Stata, R and SPSS.
  • I am fluent in SAS, Stata and R, and reasonably well-read in SPSS, but I can provide only limited support for Python.
  • I highly recommend making a habit of attending Community Coding, even if it’s just to sit and do your work with your classmates in a supportive environment.

Syllabus

This overview is not a replacement for reading the syllabus. That document contains more information on policies and procedures that you need to be aware of. You can access an HTML version on the course website, and a PDF version from inside Canvas.

Questions?