
“Mastering Data Exploration with Prompts: A Beginner’s Guide to Demystifying Data Analytics” 📊


Have you ever struggled with exploratory data analysis across multiple files in a large dataset? Perhaps you were unsure where to begin, how to write code in languages like Python, R, or SQL, or how to set up an environment for data analysis.  

Here is a “Magic Wand” that allows you to complete these processes without any code. Simply use natural language prompts with advanced AI tools based on LLMs. It serves as a powerful shortcut for initial exploration.  

When you communicate effectively with an AI tool, answers to basic questions arrive within seconds, whereas the conventional process requires setting up an environment, importing the necessary libraries, and writing code. 

This guide walks through essential EDA tasks, such as generating statistical summaries and visualizing data distributions, using intuitive prompts in AI tools. 

 

1. What Is Exploratory Data Analysis (EDA) 

 

Exploratory data analysis (EDA) is a crucial process in any data analysis project as it allows one to become familiar with the data before delving into complex modelling or decision-making.  

The objective of EDA is to understand the dataset and gain insights: summarizing it with descriptive statistics, visualizing it to identify patterns, trends, correlations, outliers, or missing values, and checking the assumptions that further analysis depends on. 

Understanding the EDA process is vital because it helps prevent mistakes, informs feature engineering, guides model selection, and ultimately leads to more reliable insights. It serves as the foundation for data-driven decision-making. 

 

2. Why Use Prompts for Basic EDA? 

 

Leveraging prompts for basic EDA can be excellent as a starting point, while complex analysis still requires traditional methods and a deeper statistical understanding.  

  • - Speed and Efficiency: Get immediate answers to basic questions without the need to write or debug code. 

  • - Accessibility: Democratizes basic analysis without technical barriers. It empowers individuals who are not proficient coders but understand data questions. 

  • - Intuitive Interaction: Leverages natural language. 

  • - Focus on the "What," Not the "How": Encourages analysts to think critically about the questions they want to ask the data, rather than getting caught up in implementation details. 

  • - Iterative Exploration: Easily adjust prompts and ask follow-up questions based on initial results. Facilitates a rapid exploration cycle. 



3. Preparing Your Data and Tool 

 

Prerequisites: Access to an AI tool that can analyze data from prompts, such as an AI notebook, a data analysis platform with a chat interface, or a code-generating assistant within an IDE. 

Data Loading: Load the dataset files into AI tools (Assume the data has been cleaned). 

  • - Titanic Dataset from Kaggle  

  • - AI tool: ChatGPT 


Context is key: Ensure that the AI knows which dataset you’re referring to in subsequent prompts.  

Example dataset: Start with a simple, well-known dataset for practice, such as Iris, Titanic, or a hypothetical ‘sales_data’ with columns like ‘Region’, ‘SalesAmount’, and ‘Age’.  
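For readers who want to see what the loading step looks like in code, here is a minimal pandas sketch. The rows below are made up for illustration in the Titanic column layout (they are not real passenger records), and the file name `train.csv` in the comment is an assumption about how your download is named.

```python
import io

import pandas as pd

# Hypothetical sample in the Titanic column layout (made-up rows, not real data).
csv_text = """PassengerId,Survived,Pclass,Sex,Age
1,0,3,male,22
2,1,1,female,38
3,1,3,female,26
4,0,2,male,
"""

# With a real file you would pass its path instead, e.g. pd.read_csv("train.csv").
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)         # number of rows and columns
print(df.dtypes)        # data type of each column
print(df.isna().sum())  # missing values per column
```

A quick check like this confirms which columns exist and whether the "already cleaned" assumption holds before you start prompting.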


4. Prompts for Statistical Summaries 

 

📍 Types of Summaries & Prompts: 

1️⃣ Overall numerical summary: AI will extract and present numerical values from the Titanic dataset. 

Prompt Examples:  

  • - "Describe the dataset." 

  • - "Show basic statistics (mean, median, std dev, min, max, count) for all numeric fields." 

  • - Tip: Specify the desired output format, such as a table or a list. 


👨🏾‍💻 Prompt: “Provide a statistical summary of the ‘Survived’, ‘Age’, and ‘Pclass’ columns in the dataset. Format it as a table.” 

🤖 AI output 


The numerical summary table for ‘Survived’, ‘Age’, and ‘Pclass’ shows the count, mean, standard deviation, and min/percentile/max values. 
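If you want to cross-check a summary like this in code, pandas offers `describe()`, which returns the same kind of table. This is only a sketch on a small made-up sample (the values are illustrative, not taken from the Titanic file).

```python
import pandas as pd

# Made-up sample rows with Titanic-style column names (illustrative only).
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass": [3, 1, 3, 2, 1, 3],
    "Age": [22.0, 38.0, 26.0, 35.0, 54.0, 2.0],
})

# count, mean, std, min, 25%, 50%, 75%, and max for the selected columns
summary = df[["Survived", "Age", "Pclass"]].describe()
print(summary)
```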


2️⃣ Targeted Numerical Summary for Specific Columns: 

Prompt Examples:  

  • - "What are the mean and median 'SalesAmount'?" 

  • -  "Summarize the 'Age' column." 

  • -  "Calculate the standard deviation for 'ProductPrice'." 


👩‍💻 Prompt: “What are the mean and median values of the ‘Age’ column in the dataset?” 

🤖 AI output:  
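As a code-based cross-check for this prompt, the mean and median can be computed directly with pandas. The sketch below uses made-up ages, including one missing value, to show that pandas skips missing entries by default.

```python
import pandas as pd

# Made-up ages for illustration; None stands in for a missing value.
df = pd.DataFrame({"Age": [22.0, 38.0, 26.0, 35.0, 54.0, 2.0, None]})

mean_age = df["Age"].mean()      # missing values are skipped by default
median_age = df["Age"].median()

print(f"Mean age: {mean_age:.1f}")
print(f"Median age: {median_age:.1f}")
```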

3️⃣ Categorical Summary for Counts & Frequencies: 

Prompt Examples:  

  • - "Show the unique values and their counts for the 'Region' column." 

  • - "What are the different categories in 'ProductType'?" 

  • - A list of all unique categories. 

  • - The number of occurrences for each category. 

  • - A summary presented in a specific format, like a table or bullet points. 


🧑🏻‍💻 Prompt: Show the unique values and their counts for the ‘Sex’ column in the dataset. 

🤖 AI output:  
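The same counts can be reproduced in code with pandas `value_counts()`. This is a minimal sketch on made-up rows; the `normalize=True` variant is an optional extra that reports shares instead of raw counts.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame(
    {"Sex": ["male", "female", "female", "male", "male", "female", "male"]}
)

counts = df["Sex"].value_counts()  # occurrences of each unique value
shares = df["Sex"].value_counts(normalize=True).mul(100).round(1)  # percent

print(counts)
print(shares)
```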

4️⃣ Grouped Summaries: 

Prompt Examples:  

  • - "Calculate the average 'SalesAmount' for each 'Region'." 

  • -  "Show the median 'Age' grouped by 'JobTitle'." 

  • - The desired output format: a table or bullet points 


👩‍💻 Prompt: Calculate the percentage of ‘Survived’ grouped by ‘Sex’. 

🤖 AI output: 
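One way to reproduce this grouped summary in code is a pandas `groupby`. Because ‘Survived’ is coded as 0 or 1, the mean of each group is its survival rate. The rows below are made up for illustration.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "male",
            "male", "female", "male", "female"],
    "Survived": [0, 1, 1, 0, 1, 0, 0, 1],
})

# The mean of a 0/1 column is the share of 1s; multiply by 100 for a percentage.
survival_pct = df.groupby("Sex")["Survived"].mean().mul(100)
print(survival_pct)
```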


Counting categorical data is crucial because it provides a clear understanding of how frequently different categories appear in the dataset. It allows for identifying trends and distributions, comparing groups, handling imbalanced data, and decision-making in models.  


5. Prompts for Data Visualization 

 

Data visualization is a critical part of the exploratory data analysis (EDA) process because it helps uncover patterns, trends, and insights that might be hidden in raw data. It makes complex data understandable, reveals outliers and data quality issues, and supports decision-making. It also makes findings easier to communicate to an audience.  

📍Common Visualizations & Prompts: 

1. Histograms for Frequency Distribution: 

Prompt Examples:  

  • - "Plot a histogram of 'Age'." 

  • - "Show the distribution of 'SalesAmount'." 

  • - "Create a histogram for 'CustomerLifetimeValue' with 15 bins." 


Prompt: Create a histogram of ‘Age’ with 25 bins. 

AI output: 

The histogram offers a granular view of age frequency, highlighting distinct age clusters and rare extreme values. 
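For comparison, a histogram with an explicit bin count can be drawn in code with matplotlib. This is a sketch on made-up ages; the output file name `age_histogram.png` is arbitrary, and the `Agg` backend is chosen only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up ages for illustration only.
df = pd.DataFrame({"Age": [2, 4, 18, 19, 21, 22, 22, 24, 26,
                           28, 29, 30, 35, 38, 40, 54, 62, 80]})

fig, ax = plt.subplots()
# bins=25 mirrors the "with 25 bins" parameter in the prompt
counts, bin_edges, _ = ax.hist(df["Age"].dropna(), bins=25, edgecolor="black")
ax.set_xlabel("Age")
ax.set_ylabel("Number of passengers")
ax.set_title("Distribution of Age (25 bins)")
fig.savefig("age_histogram.png")
```

Changing `bins` and re-running is the code equivalent of adjusting the prompt and asking again.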


2. Bar Plots for comparing groups with subcategories: 

A bar plot with a hue is useful when you need to compare categories while also breaking them down by an additional categorical variable. The color coding makes differences between subgroups easier to spot and improves readability.  

Prompt Examples:   

  • - Survival rate by passenger class in the Titanic dataset  

Prompt: Create a bar plot to show the number of people who survived based on their ‘Pclass’ and ‘Sex’. X-axis is ‘Sex’, Y-axis is ‘Survived’, and Hue is ‘Pclass’.  

AI output: 

The bar plot displays the survival rate by passenger Sex, with separate bars for each Pclass.  

  • Females: Survival is highest in 1st class (~97%), followed by 2nd (~92%) and 3rd (~50%). 

  • Males: Survival is highest in 1st class (~37%), followed by 2nd (~16%) and 3rd (~14%). 
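A grouped bar chart like this can be sketched in code as follows. "Hue" is the term plotting libraries such as seaborn use; this sketch swaps in plain pandas plotting, where each ‘Pclass’ column of the pivoted table becomes one colored bar series. The rows are made up, so the rates will not match the figures above.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({
    "Sex": ["female"] * 6 + ["male"] * 6,
    "Pclass": [1, 1, 2, 2, 3, 3] * 2,
    "Survived": [1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0],
})

# Survival rate per (Sex, Pclass); unstack puts one Pclass per column.
rates = df.groupby(["Sex", "Pclass"])["Survived"].mean().unstack("Pclass")

ax = rates.mul(100).plot(kind="bar", rot=0)  # one bar color per Pclass
ax.set_xlabel("Sex")
ax.set_ylabel("Survival rate (%)")
ax.get_figure().savefig("survival_by_sex_pclass.png")
print(rates)
```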

 

3. Box Plots for Summary & Outliers: 

Prompt Example:  

  • - "Create a box plot for 'TestScore'."  

  • - Good for comparing groups: "Box plot of 'Salary' by 'Department'." 


Prompt: Create a box plot of ‘Age’ grouped by ‘Survived’.  

AI output:  


The box plot visualizes the age distributions for passengers who survived versus those who did not: 

Median (orange line): Both groups have a median age around the high 20s, with the survivors’ median slightly lower. 

Interquartile range (IQR): Survivors show a narrower age spread between the 25th and 75th percentiles. 

Whiskers & outliers: Non-survivors include older extreme ages (up to ~74), while the oldest survivor is about 80. 
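To check the numbers behind a box plot like this, the quartiles per group can be computed with pandas and the plot drawn with matplotlib. This is a sketch on made-up ages, so its medians and extremes are illustrative and will not match the real dataset.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame({
    "Survived": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    "Age": [22, 35, 2, 40, 74, 38, 26, 54, 19, 80],
})

# Median (50%), quartiles, and min/max of Age for each Survived group
stats = df.groupby("Survived")["Age"].describe()
print(stats)

# One array of ages per group, in the order 0 then 1
groups = [df.loc[df["Survived"] == s, "Age"].dropna().to_numpy() for s in (0, 1)]

fig, ax = plt.subplots()
ax.boxplot(groups)
ax.set_xticklabels(["Did not survive (0)", "Survived (1)"])
ax.set_ylabel("Age")
fig.savefig("age_by_survived_boxplot.png")
```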


6. Crafting Effective Prompts 

 

  • - Be Specific: Define the column names, the type of analysis desired, and the name of the dataset. 

  • - Start Simple, Iterate: Ask broad questions first, then refine with more specific follow-ups. 

  • - Use Action Verbs: "Calculate," "Plot," "Show," "Summarize," "Count," "Describe." 

  • - Provide Context: When asking follow-up questions, refer back to the previous summary to maintain clarity and continuity. 

  • - Specify Parameters: Clearly state any specific requirements, such as the number of bins for a histogram or the specific statistics needed (e.g., mean, median). 

  • - Experiment: Don't be afraid to rephrase your questions if the AI doesn't understand initially. 

 

7. Limitations and Important Considerations  

 

  • - Ambiguity: AI might misinterpret vague prompts. 

  • - Complexity: May struggle with multi-step, highly complex analysis requests in a single prompt. 

  • - "Black Box" Element: Understanding how the AI calculated something might be less transparent than code. 

  • - Critical Thinking Required: The AI provides results, but you must interpret them correctly and critically. It doesn't replace statistical knowledge. 
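One practical way to handle the "black box" concern is to spot-check a figure the AI reports by recomputing it yourself. The sketch below assumes a hypothetical reported mean and made-up data; the tolerance of 0.05 is an arbitrary choice to allow for rounding in the AI's answer.

```python
import math

import pandas as pd

# Made-up ages for illustration only.
df = pd.DataFrame({"Age": [22.0, 38.0, 26.0, 35.0, 54.0, 2.0]})

reported_mean = 29.5  # hypothetical figure copied from an AI answer

recomputed_mean = df["Age"].mean()
# Allow a small tolerance because AI answers are often rounded.
matches = math.isclose(reported_mean, recomputed_mean, abs_tol=0.05)

print(f"Reported: {reported_mean}, recomputed: {recomputed_mean:.2f}, match: {matches}")
```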

 

Final Thoughts: Find Hidden Gems 💎✨ with Prompts 

 

Prompt engineering provides a quick and easy way to conduct essential EDA tasks like generating statistical summaries for numerical values and visualizing distributions, patterns, and outliers.  

It facilitates the EDA process by using AI tools to ask questions and unlock new opportunities for data exploration to aid in decision-making.

Start exploring data by utilizing prompt techniques. 

Give these prompts a try on your dataset! What insights do you discover within the first few minutes?

Feel free to share your experiences or prompt techniques in the comments below. 


🔔Subscribe for more insights!

🔹Always welcome constructive feedback or opinions. Happy reading!

🔗Connect with me:

👩‍💻 LinkedIn 👩‍💻 Kaggle 👩‍💻 Medium 👩‍💻 Instagram 👩‍💻 Substack














