Review: IBM Data Analyst Certification on Coursera
Overall rating: C-
I’ve been using SQL for a long time–probably 20 years at this point–but it’s been pretty sporadic, and usually for personal projects. In that time, I’ve mostly gotten by with online resources and my trusty Ben Forta book (which still holds up, because SQL really doesn’t change much). Even when I started using it extensively at work last year to pull data out of Blackboard, Ben Forta came through. But I decided I’d like to learn something about modern data analysis, especially data visualization, dashboarding, and Python. I thought from looking over the courses in the certificate program that it would offer what I was after. (I was wrong, but I’ll get to that later.)
Coursera has, of course, been around for a long time. There are many credible institutions and corporations that provide content for its courses and back its programs.
Coursera offers two ways to pay: you can sign up for Coursera Plus, which will set you back $399 a year, or you can pay for certificate programs by the month ($39.95), so the faster you can power through the courses, the less you pay. I suppose if you’re really dedicated to getting Coursera certificates, Coursera Plus could be a bargain.
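To put numbers on that trade-off, here is a quick back-of-the-envelope sketch using only the two prices quoted above. The function names are made up for illustration, and the math ignores trials, discounts, and taxes.

```python
import math

MONTHLY_PRICE = 39.95  # per-month certificate price quoted above
ANNUAL_PRICE = 399.00  # Coursera Plus annual price quoted above


def monthly_cost(months: int) -> float:
    """Total paid on the month-by-month plan after `months` months."""
    return round(months * MONTHLY_PRICE, 2)


def break_even_months() -> int:
    """Smallest whole number of months at which monthly billing costs at least the annual plan."""
    return math.ceil(ANNUAL_PRICE / MONTHLY_PRICE)


for months in (2, 4, 9, 10):
    print(f"{months:>2} months on the monthly plan: ${monthly_cost(months):.2f}")
print(f"Break-even vs. Coursera Plus: month {break_even_months()}")
```

At these prices, the monthly plan only overtakes the annual one in month 10, so Coursera Plus pays off mainly if you expect to be taking courses for most of a year.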
The IBM Data Analyst Professional Certificate program consists of nine courses, including a capstone project. They are:
- Introduction to Data Analytics
- Excel Basics for Data Analysis
- Data Visualization and Dashboards with Excel and Cognos
- Python for Data Science, AI & Development
- Python Project for Data Science
- Databases and SQL for Data Science with Python
- Data Analysis with Python
- Data Visualization with Python
- IBM Data Analyst Capstone Project
I chose this path because I wanted to learn Python. I expected that the courses were ordered in a way that would let me build on previous courses: that I’d learn enough in each course to apply my knowledge to the next course in the progression. There were two problems with that assumption: 1) I didn’t actually learn enough in anything past the Excel and Cognos courses to be able to apply it later; and 2) it’s pretty hard to take the courses in order because of the way Coursera works.
See, once you start enrolling in courses, they no longer appear in the order in which you’re supposed to take them in the certificate program; they appear in the order in which you enrolled. So if you enrolled in several of them during your trial period to see if they looked like good courses, that’s how they’re going to show up for you unless you unenroll (I guess).
This is how I ended up trying to complete the Python Project for Data Science before taking the prerequisite course. Not that it mattered; when I went back to the project after taking the prereq, I realized that I didn’t have to actually know anything from the course–I just needed to copy and paste from the labs. Which I suppose is a plus if you’re just trying to get the certificate without bothering to learn anything, and judging from my experience with peer reviewing, I suspect a good 75-80% of my peers are doing exactly that.
To be fair, one reason I enrolled in this program was to be able to put it on my resume as a way to get through applicant tracking systems, so I can’t really fault anyone for that. But I did think I might learn some skills as well.
The course material for the first several classes, from the introduction through the Excel and Cognos course, was actually pretty good, and your course enrollment gives you 90 days of free IBM Cloud access, rather than the standard 30-day trial. If you want to play around with Cognos, this is a good way to do it.
The rest of the courses focus on Python and some of its libraries, including pandas, Matplotlib, Plotly, and scikit-learn. The web scraping course uses Beautiful Soup. In most cases, there’s a brief (three- to four-minute) video that barely explains what each of these libraries is used for. There is no restriction on speeding up the video. I watched most of them at 1.5x.
Tests and quizzes
The quizzes were mostly very simple, often just asking a question or two to see if you’d watched the video. The tests were often even simpler.
The organization of the tests was… not great. I work with instructors regularly to construct tests in Blackboard; I know that the technological aspects of test construction are not necessarily intuitive. But even the least tech-savvy instructors I work with understand how randomization works when they’re setting up a test. If you give options like “all of the above” or “A and B, but not C,” you need to have the options in a fixed order. Many of the tests I took in these courses had “all of the above” in the first, second, or third position instead of the last; likewise, sometimes the option that was “A and B” would be in the first or second position, with nothing labeled “A” or “B.” How is anyone supposed to answer that question?
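For what it's worth, the fix is not complicated. Below is a minimal sketch of answer randomization that keeps catch-all options at the end. It is not how Coursera or Blackboard implements this; the `Option` class and its `pinned_last` flag are hypothetical names invented for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    text: str
    pinned_last: bool = False  # hypothetical flag: keep this option at the end


def shuffle_options(options, rng=None):
    """Shuffle the ordinary options and append the pinned ones in their original order."""
    rng = rng or random.Random()
    movable = [o for o in options if not o.pinned_last]
    pinned = [o for o in options if o.pinned_last]
    rng.shuffle(movable)  # in-place shuffle of the copy; the input list is untouched
    return movable + pinned


question = [
    Option("pandas"),
    Option("Matplotlib"),
    Option("scikit-learn"),
    Option("All of the above", pinned_last=True),
]

for option in shuffle_options(question, random.Random(0)):
    print(option.text)
```

Options that refer to other options by letter ("A and B, but not C") are a different case: pinning is not enough, and the whole question needs a fixed order with visible labels.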
In some questions, the answers were vast swathes of unformatted code. I know, I know; here I’ve been complaining about the lack of rigor, and then I’m complaining that this bit is too rigorous. If I thought this was done intentionally to get learners to sift through the code to verify it, that would be one thing; but the impression I got was that the people who entered it in whatever test-generation software they’re using just didn’t know how to format it correctly.
The peer-reviewed assignments are just plain bad. Here’s how it works: you submit your assignment, and when you’re done, you review two (usually; one assignment required four) of your peers’ assignments. In order to do this, you’re given a rubric. Here are my issues with that:
My cheating peers
So here’s the deal: a lot of people seem to just be doing this to get the certificate, without putting in any work whatsoever. You’ll open up someone else’s assignment and see that they’ve uploaded random files, or have entered asdfjkl; or something in a text field. I’m certain that the people who do that don’t review anyone else’s assignment either; they just go through and give the full points for everything. This kind of behavior is rampant in this course, and from what I’ve read in online reviews and on Reddit, in other courses as well. After a while, I realized that if someone did any work at all on the first two questions, they probably got the rest of them right; they were hard to get wrong if you were putting in even a mediocre effort.
A couple of times I encountered assignments that were probably AI/LLM-generated, and in that case, all I can say is: DUDE. WHY? Save your ChatGPT prompts for something more important (or at least more fun) than this pointless certificate program.
Then there are the rubrics. A lot of people aren’t familiar with rubrics and how to apply them, but beyond that, the rubrics in these courses are poorly written or hard to understand. I guess that doesn’t matter if you’re just giving everyone full points, whether or not they did the work, but if you’re trying to grade someone fairly, these rubrics don’t really help.
Downloading files, tiny screenshots, and untrusted links
Often, especially in the Excel-based courses, you have to actually download someone’s Excel spreadsheet (they recommend using the online version of Excel, but I could not get some of my classmates’ spreadsheets to open in it). For many of the assignments that required uploading a screenshot of the results, I had to zoom waaaay out to get the entire thing in a screenshot, and I assume others did as well, based on how tiny the print was on many of them. Zooming in, as the instructions tell you to do, doesn’t help if the image itself is very small. And then there are the links: while I didn’t encounter any that led me to a malicious site, it did occur to me that it could happen, especially given how many people posted completely off-the-wall, unrelated files. I certainly was very cautious about visiting any of these links.
Ugh. The discussion boards.
First off, most of the posts in the discussion boards are in the category of “PLEEZE REVIEW MY ASSIGNMENT I WILL REVIEW URS.” I couldn’t find a good way to filter those; it would really help if Coursera would implement (and enforce) threaded discussions, or even tags.
Within each course, the discussion boards are organized by week. And that’s it. Week 1 has a discussion board; Week 2 has a discussion board. It is possible to search in each week, BUT: apparently, each occurrence of the word you search for gets its own result, so you’ll end up with 38 pages of search results, with the same posts occurring over and over. “Aha!” you’ll think. “Here’s someone else who can’t find their Excel spreadsheet in IBM Cloud after saving it as directed in the capstone project!” (There were at least 78 distinct people with this issue, btw.) You’ll read the nearly incomprehensible answer posted by the staff, you’ll try it out, and it won’t work. You’ll go to page 2 and click on another post… but it’s actually the same post. Search is just not functional in the Coursera discussion boards.
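There is no way to know from the outside how Coursera's search is actually built, so treat the following as a sketch of the behavior one would expect instead: each matching post shows up once, with a hit count, rather than once per occurrence. The `Post` class and `search_posts` function are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int
    body: str


def search_posts(posts, term):
    """Return (post, hit_count) pairs, one per matching post, most hits first."""
    needle = term.lower()
    results = []
    for post in posts:
        hits = post.body.lower().count(needle)  # case-insensitive occurrence count
        if hits:
            results.append((post, hits))
    # sort() is stable, so posts with equal hit counts keep their original order
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results


posts = [
    Post(1, "My Excel file is missing. Where did the Excel file go?"),
    Post(2, "Please review my assignment"),
    Post(3, "Same problem with excel here"),
]

for post, hits in search_posts(posts, "excel"):
    print(post.post_id, hits)
```

Three occurrences of the search term across two posts produce two results, not three.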
There is a good side to the discussion boards: the staff usually posts the complete answers to every lab if anyone asks even a basic question about it. I personally think most of the “staff” replies are canned responses fired off by a bot; often the answer doesn’t have much to do with the question, and seems to be triggered by keywords rather than actually reading the question. But you could probably get through every single course in this program without ever doing any of the labs yourself, just by going through the discussion boards and copying responses from the staff.
In the end, I finished all the courses but not the capstone, and here’s why: I started Week 1 of the capstone on a Saturday. I dutifully followed the directions, set up my Watson Studio Lite account, imported my Jupyter notebooks, and began the assignment. It was kind of a fun assignment (compared to the others in the program, anyway): collecting jobs data from a GitHub dataset and using openpyxl to write it to Excel. All of that seemed to work great, and I was kind of excited: at last, something to DO!
I got stumped at the same point that at least 78 of my peers did: the Excel spreadsheet that I had supposedly saved with openpyxl wasn’t where it was supposed to be. I tried all the things I could find posted by “staff” in the discussion boards, with no luck. I had other things to do, so I saved it and decided to come back later and try.
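This is not a diagnosis of the Watson Studio problem, which may have an entirely different cause. It is only a general Python point that is worth ruling out: a file saved under a bare filename lands in the process's current working directory, which is not always the folder you are looking at. The sketch below uses only the standard library, and `jobs.xlsx` is a made-up filename.

```python
from pathlib import Path


def resolved_output_path(filename: str) -> Path:
    """Absolute location a relative filename would resolve to right now."""
    return Path(filename).resolve()


# "jobs.xlsx" is a hypothetical name; use whatever the notebook passes to save().
print("Working directory:", Path.cwd())
print("File would be written to:", resolved_output_path("jobs.xlsx"))
```

Running this in the same notebook, just before the save call, shows where to look for the file on that machine.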
On Sunday, when I resumed the lab, I got this:
So as usual, I went to the discussion boards to see what I could find. There, “staff” recommended doing the assignments locally instead of using Watson Studio. And so I spent another hour installing and configuring Jupyter Notebook on my computer. (Oh, yeah: since this is a new computer, I first had to enable WSL; back out of WSL and change my Ubuntu password, because I had apparently already set up WSL but forgotten the password; update everything; and install pip.) At that point I decided that I just don’t care enough about this certificate to continue working around Coursera’s and IBM’s deficiencies.
My conclusion: This certificate program is not worth the time and money. If your employer is paying for you to do this and is giving you time on the clock, then go for it. If you want to actually learn the skills that are promised by this course, but without a Credly badge, here are some ideas:
- Microsoft Learn has a wealth of free Excel resources. In my experience it’s really easy to get lost down a rabbit hole in Microsoft Learn; if you have difficulty finding your way out of rabbit holes (I do) you might want to try something else.
- LinkedIn Learning: There are a TON of Excel courses and paths on LinkedIn Learning. I can’t really speak to most of them, but generally LiL has great courses. If you’ve been using Excel for a while and want to go straight to the MS cert prep course, you can never go wrong with a Jen McBee course. If you’re not from the South, you might have to speed Jen up a little. I am from the South and still usually crank her up to at least 1.25. If you live someplace with a public library, or if you have an account at an academic library, there’s a pretty good chance you get access to LinkedIn Learning for free.
Seriously, just get the free trial and play with it. It’s really fun and easy to use.
- If you want to learn Python for a little bit of money, try Angela Yu’s 100 Days of Code when Udemy has a sale (which seems like every other week or so). I don’t think I’ve ever seen it under $19.99, but it’s still a bargain. Angela Yu will not just teach you Python; she’ll teach you how to take an online course. This course is so good that it actually seems a little unfair to judge the IBM courses against it, except that I spent $20 on 100 Days of Code and $80 on IBM’s certificate program, like a sucker. Note: actual thinking and problem-solving are required to get through 100 Days of Code.
- Buy a No Starch book–pretty much any No Starch book with Python in the title will do, but you can never go wrong with Automate the Boring Stuff with Python (which is also a course on Udemy).
- Check out Skillshare and get a whole year of unlimited courses for what you’ll pay for four months of the monthly plan on Coursera. I have not taken a Python class at Skillshare, but I’ve taken many others, and they’ve mostly been excellent. Even the ones that are only so-so are still better than the IBM Data Analyst Certificate program.
The Bottom Line
After completing the courses in the IBM Data Analyst Professional Certificate, I’ve lost respect both for IBM and for the University of North Texas, which apparently offers college credit for what is essentially a series of mediocre tutorials. Other Coursera programs might be better, but this certificate is just about worth the paper it’s printed on… and it’s a digital certificate.