A data analyst portfolio with 3–5 projects beats a resume with 10 certifications. Each project needs: a real-world dataset (not Titanic or Iris), a clear business question, data cleaning and analysis in SQL or Python, a visualization in Tableau or Power BI, and a written summary explaining the "so what." The best portfolios are hosted on GitHub with clean READMEs and include at least one published interactive dashboard on Tableau Public.
This article was researched and written by the Careery team.
How many portfolio projects does a data analyst need?
3–5 projects is the sweet spot. Fewer than 3 projects don't demonstrate range, and more than 7 dilute quality. The ideal portfolio has one beginner project (data exploration), two intermediate projects (multi-source analysis with dashboards), and one advanced project (end-to-end analysis with business recommendations). Quality matters far more than quantity.
What datasets should I use for data analyst portfolio projects?
Use real-world, messy datasets from sources like Census.gov, WHO, NYC Open Data, CMS (healthcare), or Kaggle competition datasets. Avoid tutorial staples like Titanic, Iris, and mtcars: hiring managers have seen them hundreds of times, and they signal tutorial completion, not analytical thinking. The messier the dataset, the better, because data cleaning is commonly estimated to take up 60–70% of real analyst work.
Where should I host my data analyst portfolio?
GitHub for SQL scripts, Python notebooks, and project documentation. Tableau Public for interactive dashboards. A personal website or GitHub Pages for a portfolio landing page. LinkedIn for visibility. The minimum viable portfolio: a GitHub profile with 3–5 repositories and at least one Tableau Public dashboard. No personal website is needed to get hired, but it helps.
A hiring manager reviewing 200 applications for a data analyst role may spend as little as 6–10 seconds on each resume. Most resumes list the same tools (SQL, Tableau, Python) with no proof of competence. The candidates who get interviews are the ones who link to a portfolio where the hiring manager can see the actual work. Portfolios are the great equalizer: they let career changers, self-taught analysts, and bootcamp graduates compete directly with candidates from top universities.
But most portfolio advice is generic: "do a project." Which project? With what data? Showing what skills? This guide answers those questions with specific, actionable project ideas that demonstrate the skills hiring managers actually evaluate.
Resumes describe skills. Portfolios prove them. For entry-level and career-change candidates, this distinction is career-defining.
A portfolio does three things a resume can't:
- Demonstrates process — not just "knows SQL" but how SQL is used to answer business questions
- Shows communication — a well-written README proves the ability to explain technical work to non-technical readers
- Proves initiative — building projects without being assigned them signals self-direction
For the full path from zero to hired — including skills, education, and job search strategy — see How to Become a Data Analyst in 2026.
Portfolios prove competence in a way that resumes and certifications cannot. A data analyst with 3 strong portfolio projects and no degree will outperform a candidate with a degree and no portfolio in most hiring processes.
Not all projects are created equal. Understanding what hiring managers evaluate separates impressive portfolios from forgettable ones.
The anatomy of an impressive project:
- Business question — A clear, specific question that a real company would ask
- Real dataset — Messy, multi-table, from a credible public source
- Data cleaning — Documented steps showing how the raw data was prepared
- Analysis — SQL queries, Python code, or both — with explanations
- Visualization — Charts or dashboards that tell the data story
- Findings & recommendations — What the data shows and what action it suggests
- Clean README — Professional documentation that a hiring manager can scan in 30 seconds
Every portfolio project needs a business question, a real dataset, documented data cleaning, analysis with visualizations, and written recommendations. The README is as important as the code — it's what hiring managers read first and often the only thing they read.
Here are 13 specific projects — organized by difficulty — that demonstrate the exact skills hiring managers evaluate.
These projects demonstrate foundational skills: SQL querying, data exploration, basic visualization, and the ability to communicate findings clearly. Complete 2–3 of these before moving to intermediate projects.
E-Commerce Sales Analysis
Business question: Which product categories drive the most revenue, and how do sales patterns vary by season?
Dataset: Brazilian E-Commerce Dataset (Kaggle) — 100K+ orders with product, customer, and review data across multiple tables.
Tools: SQL (JOINs, GROUP BY, aggregations), Excel or Python for charts
Deliverables:
- 10–15 SQL queries analyzing revenue by category, time period, and region
- 3–5 charts showing seasonal trends and category performance
- Written summary with 3 actionable recommendations
What it demonstrates: SQL fundamentals, multi-table analysis, business metric interpretation
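As a rough illustration of the kind of query this project calls for, the sketch below computes revenue by category and month with a two-way JOIN and GROUP BY. The table names, column names, and sample rows are hypothetical and much simpler than the actual Kaggle files; SQLite is used only so the example runs without a database server.

```python
import sqlite3

# Hypothetical, simplified schema. The real dataset spreads this
# information across more tables and uses different column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id TEXT PRIMARY KEY, purchased_at TEXT);
    CREATE TABLE products (product_id TEXT PRIMARY KEY, category TEXT);
    CREATE TABLE order_items (order_id TEXT, product_id TEXT, price REAL);

    INSERT INTO orders VALUES
        ('o1', '2018-01-15'), ('o2', '2018-01-20'), ('o3', '2018-11-24');
    INSERT INTO products VALUES ('p1', 'toys'), ('p2', 'electronics');
    INSERT INTO order_items VALUES
        ('o1', 'p1', 20.0), ('o2', 'p2', 150.0),
        ('o3', 'p1', 35.0), ('o3', 'p2', 200.0);
""")

# Revenue per category per month: join items to their order (for the date)
# and to their product (for the category), then aggregate.
REVENUE_BY_CATEGORY_MONTH = """
    SELECT p.category,
           strftime('%Y-%m', o.purchased_at) AS month,
           ROUND(SUM(i.price), 2) AS revenue
    FROM order_items AS i
    JOIN orders AS o ON o.order_id = i.order_id
    JOIN products AS p ON p.product_id = i.product_id
    GROUP BY p.category, month
    ORDER BY month, revenue DESC;
"""

rows = conn.execute(REVENUE_BY_CATEGORY_MONTH).fetchall()
for category, month, revenue in rows:
    print(month, category, revenue)
```

In a portfolio repository, each query of this kind would normally live in its own commented `.sql` file, with the business question it answers stated at the top.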
Public Health Dashboard
Business question: How do COVID-19 vaccination rates correlate with case outcomes across US states?
Dataset: CDC COVID-19 Data — vaccination rates, case counts, hospitalizations by state and date.
Tools: Python (pandas) for cleaning, Tableau for dashboard
Deliverables:
- Data cleaning notebook showing how CDC data was standardized
- Interactive Tableau dashboard published on Tableau Public
- Analysis comparing vaccination rates to hospitalization outcomes
What it demonstrates: Data cleaning with real government data, Tableau proficiency, public health context
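The cleaning notebook for a project like this might start with steps similar to the sketch below: standardizing state labels, parsing mixed date formats, and turning blank cells into explicit missing values. The field names and sample rows are assumptions for illustration and will not match the real CDC export. Only the standard library is used here; in practice pandas would handle the same steps.

```python
from datetime import datetime

# Hypothetical raw rows with the kinds of inconsistencies found in
# public data: stray whitespace, mixed date formats, blank cells.
raw_rows = [
    {"state": " ny ", "date": "2021-03-01", "vaccinated_pct": "18.5"},
    {"state": "New York", "date": "03/02/2021", "vaccinated_pct": ""},
    {"state": "CA", "date": "2021-03-01", "vaccinated_pct": "17.9"},
]

STATE_ABBREVIATIONS = {"new york": "NY", "california": "CA"}  # extend as needed
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def clean_state(value):
    """Map full state names to abbreviations and normalize casing."""
    text = value.strip()
    return STATE_ABBREVIATIONS.get(text.lower(), text.upper())


def parse_date(value):
    """Try each known date format in turn; fail loudly on anything else."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {value!r}")


def clean_row(row):
    """Return a row with standardized state, date, and numeric fields."""
    pct = row["vaccinated_pct"].strip()
    return {
        "state": clean_state(row["state"]),
        "date": parse_date(row["date"]),
        "vaccinated_pct": float(pct) if pct else None,  # keep gaps explicit
    }


cleaned = [clean_row(row) for row in raw_rows]
for row in cleaned:
    print(row)
```

Documenting each rule (why blanks become `None` and are not replaced with zero, for example) is what turns a cleaning script into evidence of analytical judgment.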
HR Employee Attrition Analysis
Business question: What factors predict employee turnover, and which departments are at highest risk?
Dataset: IBM HR Analytics Employee Attrition (Kaggle) — 1,470 employees with 35 features.
Tools: SQL or Python (pandas), Excel or Tableau for visualization
Deliverables:
- Exploratory analysis identifying top 5 factors correlated with attrition
- Comparison of attrition rates across departments, salary bands, and tenure
- Recommendations for HR retention strategies
What it demonstrates: Exploratory data analysis, correlation analysis, business recommendations
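The department comparison in the deliverables comes down to a grouped rate. A minimal sketch, assuming each row carries a `Department` label and a Yes/No `Attrition` flag (check these names against the downloaded file; the sample rows here are invented):

```python
from collections import defaultdict

# Hypothetical sample rows, not taken from the actual dataset.
employees = [
    {"Department": "Sales", "Attrition": "Yes"},
    {"Department": "Sales", "Attrition": "No"},
    {"Department": "Sales", "Attrition": "Yes"},
    {"Department": "Research & Development", "Attrition": "No"},
    {"Department": "Research & Development", "Attrition": "No"},
    {"Department": "Research & Development", "Attrition": "Yes"},
    {"Department": "Research & Development", "Attrition": "No"},
]


def attrition_rate_by(rows, group_key):
    """Return the share of leavers per group, highest rate first."""
    totals = defaultdict(int)
    leavers = defaultdict(int)
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        if row["Attrition"] == "Yes":
            leavers[group] += 1
    rates = {group: leavers[group] / totals[group] for group in totals}
    return dict(sorted(rates.items(), key=lambda item: item[1], reverse=True))


rates = attrition_rate_by(employees, "Department")
for department, rate in rates.items():
    print(f"{department}: {rate:.0%}")
```

The same function can be reused for salary bands or tenure buckets by passing a different `group_key`, once those columns have been binned.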
City Bike-Share Usage Patterns
Business question: When and where are bike-share trips most concentrated, and where should the city add new stations?
Dataset: Citi Bike NYC Trip Data — millions of trip records with start/end stations, times, and user types.
Tools: SQL for querying large datasets, Tableau or Power BI for mapping
Deliverables:
- SQL analysis of peak usage times, popular routes, and station utilization
- Geographic visualization showing trip density by station
- Recommendations for new station placement based on demand patterns
What it demonstrates: Large dataset handling, geospatial visualization, infrastructure recommendations
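Peak-hour and station-utilization questions map onto simple aggregations. The sketch below assumes a single hypothetical `trips` table with a start timestamp and station columns; the published trip files use their own column names and are far larger.

```python
import sqlite3

# Hypothetical miniature trips table for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trips (started_at TEXT, start_station TEXT, end_station TEXT);
    INSERT INTO trips VALUES
        ('2024-05-01 08:05:00', 'A', 'B'),
        ('2024-05-01 08:40:00', 'A', 'C'),
        ('2024-05-01 17:15:00', 'B', 'A'),
        ('2024-05-02 08:10:00', 'C', 'B');
""")

# Trips per hour of day, busiest hour first.
trips_by_hour = conn.execute("""
    SELECT CAST(strftime('%H', started_at) AS INTEGER) AS hour,
           COUNT(*) AS trips
    FROM trips
    GROUP BY hour
    ORDER BY trips DESC, hour;
""").fetchall()

# Departures per station, a first proxy for station demand.
departures_by_station = conn.execute("""
    SELECT start_station, COUNT(*) AS departures
    FROM trips
    GROUP BY start_station
    ORDER BY departures DESC, start_station;
""").fetchall()

print(trips_by_hour)
print(departures_by_station)
```

With millions of rows, the same queries benefit from an index on the timestamp column or from pre-aggregating by day before loading into Tableau or Power BI.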
Personal Finance Spending Analysis
Business question: Where does money go, and what spending categories have the most optimization potential?
Dataset: Your own bank/credit card transaction exports (anonymized), or the Synthetic Financial Dataset (Kaggle).
Tools: Python (pandas) for categorization and cleaning, Excel or Tableau for visualization
Deliverables:
- Data cleaning script that categorizes raw transactions
- Monthly spending breakdown with trend analysis
- Dashboard showing spending patterns and savings opportunities
What it demonstrates: Data cleaning with messy real-world data, categorization logic, personal relevance
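One simple way to approach the categorization script is keyword matching on the transaction description, followed by a monthly roll-up. The keyword rules, categories, and transactions below are all invented for illustration; a real export needs its own rules and will leave more rows uncategorized at first.

```python
from collections import defaultdict

# Hypothetical keyword rules: first matching category wins.
CATEGORY_KEYWORDS = {
    "Groceries": ("supermarket", "grocer"),
    "Transport": ("uber", "metro"),
    "Dining": ("cafe", "restaurant"),
}

# Hypothetical (date, description, amount) rows; negative amounts are spending.
transactions = [
    ("2024-01-03", "CITY SUPERMARKET #12", -54.20),
    ("2024-01-05", "UBER *TRIP", -12.50),
    ("2024-01-31", "PAYROLL DEPOSIT", 2500.00),
    ("2024-02-01", "Corner Cafe", -8.00),
    ("2024-02-02", "ACME LLC", -30.00),
]


def categorize(description):
    """Return the first category whose keyword appears in the description."""
    text = description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "Uncategorized"


def monthly_spend(rows):
    """Sum spending (as positive numbers) by (month, category)."""
    totals = defaultdict(float)
    for date_text, description, amount in rows:
        if amount < 0:  # ignore income and refunds
            month = date_text[:7]  # 'YYYY-MM' prefix of an ISO date
            totals[(month, categorize(description))] += -amount
    return dict(totals)


spend = monthly_spend(transactions)
for (month, category), total in sorted(spend.items()):
    print(month, category, round(total, 2))
```

Tracking the size of the "Uncategorized" bucket over successive rule revisions is a useful detail to report in the README.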
Beginner projects demonstrate SQL fundamentals, basic data cleaning, and the ability to answer a clear business question. Complete 2–3 beginner projects before moving to intermediate — they form the foundation of the portfolio.
Intermediate projects raise the bar: multi-source data, more complex analysis, and polished dashboards.
These projects demonstrate deeper analytical thinking: combining multiple data sources, statistical analysis, and dashboard design that tells a complete story. They're the projects that actually get you hired.
Airbnb Market Analysis
Business question: What factors drive Airbnb pricing in a major city, and where are the most undervalued listing opportunities?
Dataset: Inside Airbnb — listing details, reviews, pricing, and location data for major cities worldwide.
Tools: Python (pandas, matplotlib), SQL, Tableau
Deliverables:
- Multi-variable analysis of price drivers (location, property type, amenities, reviews)
- Neighborhood comparison dashboard on Tableau Public
- Statistical summary of which features most influence pricing
- Written recommendations for hypothetical new hosts
What it demonstrates: Multi-variable analysis, real estate market understanding, interactive visualization
Customer Segmentation for E-Commerce
Business question: Who are the distinct customer segments, and how should marketing strategy differ for each?
Dataset: Online Retail Dataset (UCI ML Repository) — 500K+ transactions from a UK-based online retailer.
Tools: Python (pandas), SQL, Tableau or Power BI
Deliverables:
- RFM analysis (Recency, Frequency, Monetary) calculating customer value segments
- Customer segmentation with 4–6 distinct groups and behavioral profiles
- Dashboard showing segment characteristics and recommended marketing strategies
- Executive summary with 3 actionable recommendations per segment
What it demonstrates: Customer analytics, RFM methodology, segmentation, strategic recommendations
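The RFM step can be sketched as follows: for each customer, recency is the number of days since the last purchase, frequency is the number of orders, and monetary is total spend. The order tuples and the `as_of` date are hypothetical. With the real retail file, frequency would more likely count distinct invoices, and returns would need to be filtered out first.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (customer_id, order_date, amount) rows.
orders = [
    ("c1", date(2024, 1, 5), 50.0),
    ("c1", date(2024, 3, 1), 70.0),
    ("c2", date(2023, 11, 20), 200.0),
]


def rfm_table(rows, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, order_date, amount in rows:
        previous = last_purchase.get(customer_id)
        if previous is None or order_date > previous:
            last_purchase[customer_id] = order_date
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        customer_id: (
            (as_of - last_purchase[customer_id]).days,
            frequency[customer_id],
            monetary[customer_id],
        )
        for customer_id in last_purchase
    }


rfm = rfm_table(orders, as_of=date(2024, 3, 31))
for customer_id, (recency, freq, money) in rfm.items():
    print(customer_id, recency, freq, money)
```

From here, a common next step is to bin each of the three values into scores (quintiles, for example) and combine the scores into named segments; the binning scheme is a design choice worth explaining in the write-up.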
Supply Chain Performance Dashboard
Business question: Where are the bottlenecks in the supply chain, and which suppliers consistently underperform?
Dataset: DataCo Supply Chain Dataset (Kaggle) — 180K+ orders with shipping, inventory, and supplier data.
Tools: SQL for querying, Python for analysis, Tableau for dashboard
Deliverables:
- Supplier performance scorecard (on-time delivery, defect rate, lead time)
- Geographic analysis of shipping delays by route and region
- Interactive dashboard tracking 8–10 supply chain KPIs
- Written analysis identifying top 3 bottlenecks with improvement recommendations
What it demonstrates: Operations analytics, KPI dashboard design, supplier evaluation
Social Media Engagement Analysis
Business question: Which content types and posting patterns drive the highest engagement, and what should the content strategy be?
Dataset: Social Media Sentiments Dataset (Kaggle), or collect your own data from a public page through the platform's official API.
Tools: Python (pandas, matplotlib), Tableau
Deliverables:
- Engagement analysis by content type, posting time, and day of week
- Sentiment distribution analysis using basic text patterns
- Content strategy recommendations backed by engagement data
- Dashboard showing engagement trends and optimal posting windows
What it demonstrates: Marketing analytics, content strategy, time-series analysis
Hospital Readmission Risk Analysis
Business question: Which patient characteristics predict 30-day hospital readmission, and how can the hospital reduce readmission rates?
Dataset: CMS Hospital Readmissions Data or MIMIC-III Demo (PhysioNet).
Tools: Python (pandas), SQL, Tableau
Deliverables:
- Analysis of readmission rates by diagnosis, age group, and length of stay
- Correlation analysis identifying top risk factors for readmission
- Dashboard showing readmission patterns across hospital departments
- Recommendations for intervention programs targeting high-risk patients
What it demonstrates: Healthcare analytics, risk analysis, domain-specific knowledge, compliance awareness
Every portfolio project should translate into a resume bullet. For the exact formula and templates, see Data Analyst Resume Guide.
Intermediate projects demonstrate the ability to work with complex, multi-source data and produce dashboard-level deliverables with business recommendations. These are the projects hiring managers weigh most heavily — they're closest to actual analyst work.
For candidates targeting senior-level or specialized roles, advanced projects demonstrate leadership-level analytical thinking.
These projects signal senior-level thinking: end-to-end analytical rigor, sophisticated methodology, and the ability to drive business decisions. Include one advanced project to stand out in competitive applicant pools.
A/B Test Analysis Framework
Business question: Did a product change improve conversion rates, and was the result statistically significant?
Dataset: Create a synthetic A/B test dataset or use Kaggle A/B Testing Datasets.
Tools: Python (scipy, pandas), SQL, Jupyter Notebook
Deliverables:
- Statistical analysis: hypothesis test, p-value, confidence interval, effect size
- Sample size calculation and power analysis
- Visualization of conversion funnels for control and treatment groups
- Written executive summary explaining results in non-technical language
- Reusable A/B testing template in Python
What it demonstrates: Experimental design, statistical rigor, executive communication
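A reusable template for this project could be built around a two-proportion z-test, one common choice for comparing conversion rates (not the only valid one). The sketch below uses only the standard library; the function names and the sample counts are assumptions, and the sample-size formula is the usual normal approximation for two equal-sized groups.

```python
from math import ceil, sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def two_proportion_ztest(conv_control, n_control, conv_treatment, n_treatment,
                         alpha=0.05):
    """Two-sided z-test for a difference in conversion rates."""
    p_control = conv_control / n_control
    p_treatment = conv_treatment / n_treatment
    diff = p_treatment - p_control

    # Pooled standard error under the null hypothesis of equal rates.
    pooled = (conv_control + conv_treatment) / (n_control + n_treatment)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treatment))
    z = diff / se_pooled
    p_value = 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))

    # Unpooled standard error for the confidence interval on the difference.
    se_diff = sqrt(p_control * (1 - p_control) / n_control
                   + p_treatment * (1 - p_treatment) / n_treatment)
    z_crit = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se_diff, diff + z_crit * se_diff)
    return {"diff": diff, "z": z, "p_value": p_value, "ci": ci}


def sample_size_per_group(p_baseline, p_expected, alpha=0.05, power=0.8):
    """Approximate users needed per group to detect the given lift."""
    z_alpha = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = STANDARD_NORMAL.inv_cdf(power)
    variance = p_baseline * (1 - p_baseline) + p_expected * (1 - p_expected)
    return ceil((z_alpha + z_beta) ** 2 * variance
                / (p_expected - p_baseline) ** 2)


# Hypothetical experiment: 200/2000 control vs 250/2000 treatment conversions.
result = two_proportion_ztest(200, 2000, 250, 2000)
required_n = sample_size_per_group(0.10, 0.125)
print(result)
print(required_n)
```

The executive summary would then translate these numbers into plain language: the size of the lift, how confident the team can be in it, and whether the test ran long enough to support a decision.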
End-to-End Business Intelligence Pipeline
Business question: How can raw data be turned into an executive dashboard through a complete analytical pipeline with automated refresh?
Dataset: Any large public dataset (NYC taxi data, Census, or OpenWeather API).
Tools: SQL (data warehouse queries), Python (ETL script), Tableau or Power BI (dashboard), GitHub (documentation)
Deliverables:
- Python script that extracts, transforms, and loads data
- SQL views and aggregations for efficient dashboard querying
- Executive dashboard with 10+ KPIs, drill-down capability, and auto-refresh
- Technical documentation covering data flow, refresh schedule, and maintenance
What it demonstrates: Full-stack analytics thinking, pipeline design, production-quality work
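The extract, transform, load, and SQL-view deliverables fit together roughly as in the sketch below. It is a minimal, assumption-heavy outline: the CSV columns, the validation rules, the `trips` table, and the `daily_kpis` view are all hypothetical, and an in-memory SQLite database stands in for a data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract with a blank fare and a negative fare.
RAW_CSV = """pickup_date,fare,distance_km
2024-01-01,12.50,3.2
2024-01-01,,1.1
2024-01-02,30.00,9.8
2024-01-02,-5.00,2.0
"""


def extract(source):
    """Read CSV rows from a file-like object into dictionaries."""
    return list(csv.DictReader(source))


def transform(rows):
    """Drop rows with missing, non-numeric, or non-positive values."""
    cleaned = []
    for row in rows:
        try:
            fare = float(row["fare"])
            distance = float(row["distance_km"])
        except ValueError:
            continue  # blank or malformed number
        if fare <= 0 or distance <= 0:
            continue
        cleaned.append((row["pickup_date"], fare, distance))
    return cleaned


def load(conn, rows):
    """Full-refresh load, plus a view the dashboard can query directly."""
    conn.execute("CREATE TABLE IF NOT EXISTS trips "
                 "(pickup_date TEXT, fare REAL, distance_km REAL)")
    conn.execute("DELETE FROM trips")  # simple full refresh on every run
    conn.executemany("INSERT INTO trips VALUES (?, ?, ?)", rows)
    conn.execute("""
        CREATE VIEW IF NOT EXISTS daily_kpis AS
        SELECT pickup_date,
               COUNT(*) AS trips,
               ROUND(SUM(fare), 2) AS revenue,
               ROUND(SUM(fare) / SUM(distance_km), 2) AS fare_per_km
        FROM trips
        GROUP BY pickup_date
    """)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(io.StringIO(RAW_CSV))))
kpis = conn.execute("SELECT * FROM daily_kpis ORDER BY pickup_date").fetchall()
print(kpis)
```

Automated refresh then amounts to running this script on a schedule (cron or a CI workflow, for instance) and pointing the dashboard at the view; the technical documentation should state which scheduler is used and how failures are surfaced.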
Market Entry Analysis
Business question: Should a hypothetical company expand into a new geographic market, and which market offers the best opportunity?
Dataset: Combine Census data, BLS employment data, industry-specific datasets, and economic indicators.
Tools: Python (pandas), SQL, Tableau, Excel (financial model)
Deliverables:
- Multi-source data integration from 3+ public datasets
- Market scoring model with weighted criteria (population, income, competition, growth)
- Financial projection model in Excel showing revenue scenarios
- Executive presentation (5–7 slides) with market recommendation
- Complete analytical appendix with methodology and data sources
What it demonstrates: Strategic analysis, multi-source data integration, financial modeling, executive communication — the complete skill set of a senior analyst
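A market scoring model with weighted criteria can be as simple as min-max normalizing each criterion and taking a weighted sum. In the sketch below, the weights, the three markets, and all figures are invented placeholders; the competitor count is treated as a lower-is-better criterion and inverted after normalization.

```python
# Hypothetical weights (must sum to 1) and criteria direction.
WEIGHTS = {
    "population": 0.3,
    "median_income": 0.3,
    "competitors": 0.2,
    "growth_rate": 0.2,
}
LOWER_IS_BETTER = {"competitors"}

# Hypothetical candidate markets with made-up values.
markets = {
    "Market A": {"population": 500_000, "median_income": 60_000,
                 "competitors": 12, "growth_rate": 0.02},
    "Market B": {"population": 300_000, "median_income": 75_000,
                 "competitors": 4, "growth_rate": 0.04},
    "Market C": {"population": 800_000, "median_income": 50_000,
                 "competitors": 20, "growth_rate": 0.01},
}


def min_max(values):
    """Scale values to the 0-1 range; constant columns map to 0.5."""
    low, high = min(values), max(values)
    if high == low:
        return [0.5] * len(values)
    return [(value - low) / (high - low) for value in values]


def score_markets(candidates, weights):
    """Return {market: weighted score}, best market first."""
    names = list(candidates)
    scores = dict.fromkeys(names, 0.0)
    for criterion, weight in weights.items():
        normalized = min_max([candidates[name][criterion] for name in names])
        for name, value in zip(names, normalized):
            if criterion in LOWER_IS_BETTER:
                value = 1 - value  # fewer competitors scores higher
            scores[name] += weight * value
    return dict(sorted(scores.items(), key=lambda item: item[1], reverse=True))


ranking = score_markets(markets, WEIGHTS)
for name, score in ranking.items():
    print(f"{name}: {score:.2f}")
```

Because the ranking depends directly on the weights, the analytical appendix should justify them and show how the recommendation changes under at least one alternative weighting.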
Advanced projects demonstrate what separates senior analysts from mid-level: the ability to define the question, integrate multiple data sources, apply rigorous methodology, and present findings at the executive level. One advanced project in a portfolio signals readiness for higher-level roles.
Projects are only valuable if hiring managers can find and evaluate them. Presentation matters.
README template:
# [Project Title]
## Business Question
[One sentence describing the problem this analysis solves]
## Dataset
## Tools Used
## Key Findings
1. [Most important finding with specific number]
2. [Second finding]
3. [Third finding]
## Recommendations
## Files
## Methodology
[2-3 paragraphs explaining approach, assumptions, and limitations]
Portfolio presentation checklist:
- Every project has a clean README that a hiring manager can scan in 30 seconds
- SQL files include comments explaining business logic
- Jupyter notebooks have markdown cells explaining each analysis step
- At least one interactive dashboard is published on Tableau Public
- The GitHub profile README links to all projects with one-line descriptions
A portfolio is one piece of personal branding. For a complete strategy on building professional visibility as a data analyst, see Personal Branding for Data Analysts.
- 3–5 portfolio projects beat 10 certifications: portfolios prove competence, credentials prove course completion
- Use real-world, messy datasets from Census.gov, WHO, NYC Open Data, or Kaggle competitions, and avoid tutorial staples like Titanic and Iris
- Every project needs: a business question, documented data cleaning, analysis, visualization, and written recommendations
- Beginner projects demonstrate SQL and data exploration; intermediate projects show multi-source analysis and dashboards; advanced projects signal strategic thinking
- Host on GitHub with clean READMEs and publish dashboards on Tableau Public: presentation quality matters as much as analytical quality
- The README is the most important file in each project: it's what hiring managers read first (and often the only thing they read)
Can I use Kaggle datasets for my portfolio?
Yes — but avoid the most common tutorial datasets (Titanic, Iris, mtcars, Boston Housing). Use Kaggle competition datasets or less common datasets from the Kaggle Datasets section. The key is choosing datasets that are messy enough to demonstrate real data cleaning skills and complex enough to support meaningful analysis.
Do I need a personal website for my portfolio?
No. A well-organized GitHub profile with clean READMEs and a Tableau Public profile with published dashboards is sufficient to get hired. A personal website adds polish but isn't required. If you build one, keep it simple — a landing page with project descriptions and links is enough.
How long should each portfolio project take?
Beginner projects: 1–2 weeks. Intermediate projects: 2–3 weeks. Advanced projects: 3–4 weeks. These timelines assume 10–15 hours per week of focused work. The most common mistake is spending too long on one project instead of building a portfolio with range.
Should I include group projects or only solo work?
Solo projects are stronger for portfolios because they clearly demonstrate individual capability. If you include a group project, clearly document your specific contribution — which analyses you ran, which dashboards you built, which sections of the report you wrote.
What if my portfolio projects use different tools than the job posting requires?
The analytical thinking transfers across tools. A strong Tableau project demonstrates dashboard design skills that apply to Power BI. A pandas analysis demonstrates data manipulation skills that apply to SQL. Focus on demonstrating analytical process and business thinking — tool-specific skills are the easiest part to learn on the job.
Prepared by Careery Team
Researching Job Market & Building AI Tools for careerists · since December 2020
- Storytelling with Data: A Data Visualization Guide for Business Professionals. Cole Nussbaumer Knaflic (2015)
- Occupational Outlook Handbook: Data Analysts and Scientists. Bureau of Labor Statistics (2025)
- Inside Airbnb: Adding Data to the Debate. Inside Airbnb (2025)