Learning ML by Building a Tiny Password Strength Classifier
I finally sat down and built a machine learning model in a Jupyter notebook that actually does something cybersecurity-related. Not a big fancy neural network. Not a GPT clone. Just a tiny password strength classifier that helped me understand the full ML pipeline without frying my brain.

This whole thing started because I kept telling myself I would learn ML one day. And one day never comes when you wait for the perfect idea. So I forced myself to build something so stupid simple that I couldn’t run away from it.
Turns out, that worked.
What I wanted to build
I wanted a model that takes a password and predicts whether it is strong or weak based on its structure. Nothing about leaks, entropy, breached databases or cracking times. Just pure structural features, built in a Jupyter notebook.

I kept it simple and picked four things to analyse:
- length
- uppercase letters
- digits
- symbols
The idea was to convert each password into a set of numerical features like:
length, has_uppercase, has_digit, has_symbol
So a password like Abc123!@ becomes something like 8, 1, 1, 1.
Creating rules that didn’t fight me
This was the hardest part. Not the ML. Not the code. Just defining what I believe a strong password is.
At first I made the rule way too strict and then changed it repeatedly. That made the dataset contradictory and the model learned nonsense. Eventually I locked in what actually made sense:
- Password must be at least 8 characters long
- And it must have at least 2 out of these 3:
  - uppercase
  - digit
  - symbol
So something short like 8B$ is weak even though it has good complexity. And something long like averystrongpassword is weak because it has no variety. Finally, the rules aligned with my intuition.
Writing the labeler function
I wrote a tiny function that checks each password and turns it into a 0 or 1 based on the rules.
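A minimal sketch of a labeler along these lines. The function name `label_password` and the choice to treat any ASCII punctuation character as a "symbol" are assumptions made for illustration, not necessarily what the notebook used.

```python
import string


def label_password(password):
    """Return 1 (strong) or 0 (weak) using the two rules described above."""
    has_upper = any(c.isupper() for c in password)
    has_digit = any(c.isdigit() for c in password)
    # Assumption: a "symbol" is any ASCII punctuation character.
    has_symbol = any(c in string.punctuation for c in password)

    # Count how many of the three character classes are present (0 to 3).
    variety = has_upper + has_digit + has_symbol

    # Strong only if it is long enough AND has at least 2 of the 3 classes.
    return int(len(password) >= 8 and variety >= 2)


for pw in ["8B$", "averystrongpassword", "Abc123!@"]:
    print(pw, label_password(pw))
```

The first two passwords come out as 0 (too short, and no variety) and the third as 1, which matches the examples in the rules.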
Once I had this, I relabeled my dataset and everything started behaving predictably.

Turning passwords into features
My extractor function was tiny too:
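A sketch of what such an extractor could look like, producing the four features in the order length, has_uppercase, has_digit, has_symbol. The name `extract_features` and the use of `string.punctuation` to define a symbol are assumptions.

```python
import string


def extract_features(password):
    """Turn a password into [length, has_uppercase, has_digit, has_symbol]."""
    return [
        len(password),
        int(any(c.isupper() for c in password)),
        int(any(c.isdigit() for c in password)),
        # Assumption: a "symbol" is any ASCII punctuation character.
        int(any(c in string.punctuation for c in password)),
    ]


print(extract_features("Abc123!@"))  # [8, 1, 1, 1]
```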

This gave me clean numerical data I could feed into scikit-learn.
Training the model
After that, the ML part was almost boring. In a good way.
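For illustration, here is a rough end-to-end sketch of the training step. Everything specific in it is an assumption: the post doesn't say which estimator was used, so `DecisionTreeClassifier` is a stand-in, the 16 passwords are a made-up toy dataset, and the two helper functions are hypothetical implementations of the rules and features described earlier.

```python
import string

from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def extract_features(password):
    """[length, has_uppercase, has_digit, has_symbol] (hypothetical helper)."""
    return [
        len(password),
        int(any(c.isupper() for c in password)),
        int(any(c.isdigit() for c in password)),
        int(any(c in string.punctuation for c in password)),
    ]


def label_password(password):
    """1 if length >= 8 and at least 2 of 3 character classes, else 0."""
    _, upper, digit, symbol = extract_features(password)
    return int(len(password) >= 8 and upper + digit + symbol >= 2)


# Made-up toy dataset, not the data from the original notebook.
passwords = [
    "password", "8B$", "averystrongpassword", "abc123",
    "Qwerty", "12345678", "letmein!", "A1!",
    "Abc123!@", "Summer2024", "correct-Horse9", "hunter2!hunter",
    "P@ssw0rd", "Tr0ub4dor&3", "MyDog#Rex", "blue42sky!",
]
X = [extract_features(p) for p in passwords]
y = [label_password(p) for p in passwords]

# Hold out a quarter of the data, keeping the strong/weak ratio the same.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Assumption: a decision tree; any scikit-learn classifier would slot in here.
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

print(classification_report(y_test, model.predict(X_test), zero_division=0))
```

With so few samples the numbers in the report will swing a lot depending on the split, so treat them as a sanity check rather than a real benchmark.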
I evaluated it with classification_report and it performed exactly how you’d expect on such a tiny dataset. Not perfect, but good enough to prove that:
- my labels made sense
- my features were consistent
- the model actually learned the pattern instead of memorising random junk

What I learned

Honestly, the biggest lesson wasn’t about ML. It was about myself.
- I overthink everything when I try to learn something new
- Simple models are the best place to start
- Jupyter notebooks make experimentation painless
- ML is not scary once you run a full cycle end to end
- A small dataset is actually a blessing when you’re trying to understand the process
This little password strength classifier is nowhere near ready for real-world use, but it taught me how ML works in practice, not just in theory.
What’s next
I might expand it with more features like:
- checking for dictionary words, because SecLists is very beefy
- repeated patterns
- keyboard adjacency
- entropy approximations
But I’ll do it one step at a time. The whole point of this exercise was to stop trying to build the final boss on day one.
And honestly, making this tiny password checker did more for my ML understanding than any tutorial ever has.