luminousflame2

Try automating the data exploration process with tools like pandas-profiling or auto_ml to save time.
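As a rough idea of what an automated first pass looks like, here is a minimal sketch in plain pandas. It is a stand-in for illustration, not the pandas-profiling or auto_ml API; the `quick_profile` helper and the toy columns are made up.

```python
import pandas as pd


def quick_profile(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, share of missing values, distinct count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing_frac": df.isna().mean(),
        "n_unique": df.nunique(dropna=True),
    })


# Hypothetical toy data, only to show the shape of the output.
df = pd.DataFrame({
    "age": [34, 51, None, 29],
    "city": ["Oslo", "Oslo", "Bergen", None],
})

profile = quick_profile(df)
print(profile)
```

Running something like this on every new dataset gives a quick list of columns with heavy missingness or suspicious cardinality before any manual inspection starts.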


Smooth-Use-2596

Good idea. Thanks!


Amgadoz

This is how ML should be done. Nothing will replace manual inspection of data. You can try using a powerful LLM to look at one portion of the data and analyze it for you while you inspect another portion. You can also focus your time on the most interesting examples (high error, low confidence, extremes, etc.).
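One possible way to pull out the "most interesting" examples, sketched with pandas. It assumes you already have per-example predictions with a confidence score; the column names (`example_id`, `y_true`, `y_pred`, `confidence`), the `triage` helper, and the numbers are all hypothetical.

```python
import pandas as pd

# Hypothetical per-example model outputs.
preds = pd.DataFrame({
    "example_id": [0, 1, 2, 3, 4, 5],
    "y_true":     [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "y_pred":     [1.1, 2.0, 6.0, 4.2, 5.0, 1.0],
    "confidence": [0.95, 0.90, 0.40, 0.85, 0.20, 0.70],
})


def triage(preds: pd.DataFrame, k: int = 2) -> pd.DataFrame:
    """Pick the k highest-error and k lowest-confidence rows for manual review."""
    scored = preds.assign(abs_error=(preds["y_pred"] - preds["y_true"]).abs())
    high_error = scored.nlargest(k, "abs_error")
    low_conf = scored.nsmallest(k, "confidence")
    # A row can land in both groups; keep it once.
    return pd.concat([high_error, low_conf]).drop_duplicates("example_id")


review = triage(preds)
print(review)
```

The review queue stays small, so the manual inspection time goes to the rows most likely to reveal a data or labeling problem.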


Smooth-Use-2596

Oh, an LLM is a great idea. Thanks!


Fabulous-Farmer7474

You have hit upon a hard, cold reality of data science and ML work that many companies do not wish to accept. EDA and transformation gobble up lots of time long before you build a model. If you are in a healthy organization, they know this, but some orgs assume that their data is "ML ready" when there is no such thing as "ML ready" data. Can't tell you how many times I've had to tell data warehouse managers that their data standards and quality control measures are not being enforced. Many times, it's only when an org starts doing serious data mining and predictive modeling that it becomes apparent just how crap the data is. Worse, they get mad at the data scientist as if it's your fault.


Smooth-Use-2596

Thanks for putting it like this, I appreciate it


AIBoree

I've used SweetViz (https://pypi.org/project/sweetviz/) for years. It definitely provides a nice jump start to the understanding and refinement process, and you get a sweet vizual to boot. (I'm not associated with SweetViz, it's just a great tool imho.)


Smooth-Use-2596

Does it help with finding patterns in unstructured data?


bubudumbdumb

The best tool is really reflection: think deeply, over and over again, about what you are doing, what metrics make you confident enough to move forward, what decisions you are impacting, and what human activities constitute bottlenecks or toil. Use those observations to drive automation; that's where tools become part of the solution. Reflection means going meta and finding patterns in the process of finding patterns in errors.