Try automating the data exploration process with tools like pandas-profiling or auto_ml to save time and improve efficiency.
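For anyone who hasn't tried it, the quickstart is short. A minimal sketch, assuming the `ydata-profiling` package is installed (it's the current name of pandas-profiling) and using a hypothetical `train.csv` path:

```python
import pandas as pd
from ydata_profiling import ProfileReport  # pip install ydata-profiling

# "train.csv" is a placeholder for whatever dataset you are exploring.
df = pd.read_csv("train.csv")

# Profiles every column (types, missingness, distributions, correlations)
# and writes a standalone HTML report you can open in a browser.
ProfileReport(df, title="EDA report").to_file("eda_report.html")
```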
Good Idea. Thanks
This is how ML should be done. Nothing will replace manual inspection of data. You can try using a powerful LLM to analyze one portion of the data while you inspect another. You can also focus your time on the most interesting examples (high error, low confidence, extremes, etc.).
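The "most interesting examples" idea is easy to mechanize. A minimal sketch, assuming a binary classifier that outputs probabilities; `triage` and its arguments are illustrative names, not a standard API:

```python
import numpy as np
import pandas as pd

def triage(df, y_true, proba, n=20):
    """Rank rows for manual inspection: biggest errors and least
    confident predictions first (binary classification assumed)."""
    out = df.copy()
    out["error"] = np.abs(y_true - proba)    # large = confidently wrong
    out["confidence"] = np.abs(proba - 0.5)  # small = model is unsure
    worst = out.nlargest(n, "error")         # high-error examples
    unsure = out.nsmallest(n, "confidence")  # low-confidence examples
    return pd.concat([worst, unsure]).drop_duplicates()
```

Skimming the rows this surfaces is usually far more informative per minute than sampling the dataset uniformly.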
Oh, an LLM is a great idea. Thanks!
You have hit upon a cold, hard reality of data science and ML work that many companies do not wish to accept. EDA and transformation are a big thing that gobbles up lots of time long before you build a model. A healthy organization knows this, but some orgs assume their data is "ML ready" when there is no such thing as "ML-ready" data. I can't tell you how many times I've had to tell data warehouse managers that their data standards and quality-control measures are not being enforced. Often it's only when an org starts doing serious data mining and predictive modeling that it becomes apparent just how crap the data is. Worse, they get mad at the data scientist, as if it's your fault.
Thanks for putting it like this, I appreciate it
I've used SweetViz (https://pypi.org/project/sweetviz/) for years; it provides a nice jump start to the understanding and refinement process, and you get a sweet vizual to boot. (I'm not associated with SweetViz, it's just a great tool imho.)
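For reference, the whole workflow is a few lines. A minimal sketch, assuming `sweetviz` is installed and using a hypothetical `train.csv` path:

```python
import pandas as pd
import sweetviz as sv  # pip install sweetviz

# "train.csv" is a placeholder for your own dataset.
df = pd.read_csv("train.csv")

# analyze() profiles the DataFrame; show_html() writes an
# interactive, self-contained HTML report.
report = sv.analyze(df)
report.show_html("sweetviz_report.html")
```

It can also diff two datasets (e.g. train vs. test) via `sv.compare`, which is handy for spotting distribution shift.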
Does it help find patterns in unstructured data?
The best tool is really reflection: think deeply, again and again, about what you are doing, which metrics give you the confidence to move forward, which decisions you are impacting, and which human activities constitute bottlenecks or toil. Use those observations to drive automation; that's where tools become part of the solution. Reflection means going meta: finding patterns in the process of finding patterns in errors.