Published in Towards Data Science · Pinned
The Only 30 Methods You Should Master To Become A Pandas Pro
After using pandas for over three years, here are the 30 methods I have used almost all the time
Pandas is undoubtedly one of the best libraries ever built in Python for tabular data-wrangling and processing tasks. Because it is open source, numerous developers from around the world have contributed to its development and brought it to where it is today, supporting hundreds of methods for various tasks.
Data Science · 6 min read
Published in Geek Culture · Pinned
450+ Practice Questions That Will Make You a Pandas, NumPy, and SQL Pro
A self-curated collection of practice questions to improve your Data Science Skills.
Pandas, NumPy, and SQL undoubtedly sit at the core of every data science project. These tools are indispensable to the entire development life cycle of a data-driven project, making them an essential skill to possess to begin/maintain a career in data science.
Data Science · 5 min read
Published in Towards Data Science · Pinned
Five Killer Optimization Techniques Every Pandas User Should Know
A step towards data analysis run-time optimization
The drive to design and build machine learning models that work in the real world has always pushed data scientists to leverage optimized, efficient, and accurate methods at scale. Optimization plays a foundational role in sustainably delivering real-world and user-facing software solutions. While I understand that not everyone is building solutions at scale, awareness…
Data Science · 9 min read
Published in Towards Data Science · Pinned
20% of Pandas Functions that Data Scientists Use 80% of the Time
Putting Pareto’s Principle to work on the Pandas library
Mastering an entire Python library like Pandas can be challenging for anyone. However, if we take a step back and think, do we really need to be aware of every minute detail of a specific library, especially when we live in a world governed by Pareto’s Principle? …
Pandas · 5 min read
Published in Towards Data Science · Pinned
Why I Stopped Dumping DataFrames to a CSV and Why You Should Too
It’s time to say goodbye to pd.to_csv() and pd.read_csv()
Building an end-to-end data-driven pipeline is challenging and demanding. Having been there myself, I know the process is extremely tedious, and one may inevitably end up with numerous intermediate files. …
Pandas · 4 min read
Published in Towards Data Science · 1 day ago
Introducing PivotUI: Never Use Pandas To GroupBy and Pivot Your Data Again
Simplifying data analysis for everyone
Motivation
Pivoting and grouping operations are fundamental to every typical tabular data analysis process. The pivot_table() and groupby() methods are among the most commonly used methods in Pandas. Used primarily for understanding categorical data, grouping lets you compute statistics for individual groups in the data.
Data Science · 5 min read
Published in Geek Culture · 6 days ago
Voice-Assisted Image Generation With Stable Diffusion
A voice-assisted app to generate images from speech
Ever since text-to-image models such as DALL-E, DALL-E2, and Google Imagen have shown breakthroughs by generating astonishing and realistic images just from a textual prompt, there has been an increasing interest among users to test these models themselves. A few months back, Stable Diffusion fulfilled this desire and open-sourced a…
Artificial Intelligence · 6 min read
Published in Towards Data Science · Nov 22
Never Worry About Optimization. Process GBs of Tabular Data 25x Faster With No-Code Pandas
No more run-time and memory optimization, let’s get straight to work
Motivation
Pandas makes the task of analyzing tabular datasets an absolute breeze. The sleek API design offers a wide range of functionality that covers almost every tabular data use case. However, it’s only when someone transitions towards scale that they experience the profound limitations of Pandas. …
Data Science · 6 min read
Published in Towards Data Science · Nov 21
Introducing Reloading: Never Re-Run Your Python Code Again To Print More Details
Modify your code during run-time and save hours of work time
Motivation
While running Python scripts, I have often found myself in situations where I forgot to print all the necessary details to track the pipeline’s progress. This typically happens when training machine learning models. More often than not, folks (including me) forget to: Add necessary logging details. Print essential…
Data Science · 4 min read
Published in Towards Data Science · Nov 9
Two Killer Jupyter Hacks That Are Guaranteed To Save You Hours Of Work Time
The Moment You Start Using Them
Jupyter Notebooks, because of their simple, streamlined, beginner-friendly, and sleek design, are almost indispensable to any Python-oriented task today. Thinking retrospectively, I cannot even imagine my life without an Interactive Python (IPython) tool like Jupyter.
Data Science · 4 min read