YDT Blog

In the YDT blog you'll find the latest news about the community, tutorials, helpful resources and much more! React to the news with the emotion stickers and have fun!

[D] How do you come up with your proofs?

I am a grad student struggling to produce a high-quality, elegant research paper. That usually requires stating a theorem or two with proofs and theoretical analysis. When I read ICML or NeurIPS papers, I wonder how these research groups come up with their proof ideas.

I do not struggle with the proof itself but mainly with "what should I prove". Also, what is the best strategy for starting well-founded research? Do you start with the theory and build experiments to validate your theoretical hypothesis, or do you run multiple experiments and, once something works, try to come up with a theoretical explanation? I think the first option sounds better and safer, but finding the right theorem or theoretical basis to start from is very hard.

Any feedback or shared experience is very welcome. Thank you.

submitted by /u/samikhenissi

Source: Reddit Machine Learning

[Project] ASAG appropriate dataset

Dear reader,

I would like to find a public dataset that includes questions with at least 100 answers per question, preferably more. These questions have to be open-ended (not essays) and graded. This is part of a project on ASAG (automatic short answer grading).

An example would be:

What do you know of the German reunification?

Possible answers would be:

– the process in 1990 in which the German Democratic Republic became part of the Federal Republic of Germany to form the reunited nation of Germany. (1 point)

– West Germany's annexation of East Germany in 1989. (0 points)
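For illustration only (none of this appears in the original post): a record like the example above could be stored as a question with a reference answer and a list of graded student answers, and a crude ASAG baseline could score new answers by their TF-IDF cosine similarity to the reference. The field names and the 0.5 cutoff below are assumptions, not part of any existing dataset.

```python
# A minimal ASAG sketch (hypothetical field names and threshold) that grades
# student answers by TF-IDF cosine similarity to a reference answer.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

record = {
    "question": "What do you know of the German reunification?",
    "reference_answer": (
        "The process in 1990 in which the German Democratic Republic became "
        "part of the Federal Republic of Germany to form the reunited nation "
        "of Germany."
    ),
    "student_answers": [
        "The process in 1990 in which the GDR joined the Federal Republic of Germany.",
        "West Germany's annexation of East Germany in 1989.",
    ],
}

# Fit the vectorizer on all texts so the reference and answers share a vocabulary.
texts = [record["reference_answer"]] + record["student_answers"]
vectorizer = TfidfVectorizer().fit(texts)
ref_vec = vectorizer.transform([record["reference_answer"]])
ans_vecs = vectorizer.transform(record["student_answers"])

# Crude grading rule: similarity above 0.5 earns the point (threshold is arbitrary).
similarities = cosine_similarity(ans_vecs, ref_vec).ravel()
grades = [1 if s >= 0.5 else 0 for s in similarities]
print(list(zip(similarities.round(2), grades)))
```

A real ASAG system would of course learn the grading function from the 100+ graded answers per question the post asks for, rather than relying on a fixed similarity threshold.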

Unfortunately, I have only been able to find datasets that include many questions with only 1 to 30 answers per question.

Any help would be greatly appreciated.

submitted by /u/realIkbenoud1

Source: Reddit Machine Learning

How Data Analysis Informs Our Understanding of Opioid Abuse

Despite their strong pain-relieving properties, opioids have become a scourge on society in recent years. Opioids, the class of drugs either naturally or synthetically derived from the opium poppy, bind to the body's opioid receptors in order to relieve pain. And they're one of the world's most highly addictive substances.

As early as 2014, the misuse of opioids reached epidemic proportions in the United States. That year, more than 28,000 people died from an overdose of either heroin or prescription opioids. A further 2 million Americans abuse or are dependent on prescription opioids, according to Recovery Centers of America.

These numbers and similar data may hold the key to finding tangible solutions to the opioid epidemic. Our understanding of opioid abuse, as well as its causes and effects, is partially informed by data analysis. Using relevant public health information, data analysis can pinpoint opioid use trends and patterns. Via the objective analysis of opioid-related data, potential large-scale treatment methods may also begin to emerge. In this way, data analytics is among the myriad modern technological innovations helping to foster change on a global scale, improving our lives and overall public health.

Opioid Data Analysis: What We Know

On the opioid front, data analysis …

Read More on Datafloq

Source: Datafloq

Model Validation and Cost in Industry

I was watching an industry presentation on YouTube, and the speaker mentioned two things that stumped me:

  1. They monitor the performance of multiple models in real time. They use the results to assess when to retrain models.

How do you monitor performance in real time? Take, for example, a model that predicts housing prices: you don't get feedback on the prediction until the house actually sells, which can be a month, two months, or even years from now.

Is there another mechanism that can be used as a proxy for performance that I'm missing? (A drift-monitoring sketch along these lines appears after this post.)

  2. They take cost into account when developing machine learning models. For example, what is the business cost of a false positive versus the cost of a false negative?

I actually never thought of it that way, and it makes sense. In the case of fraud, a false negative is far more costly than a false positive. What are some of the common approaches when it comes to assessing model cost? (One cost-based thresholding approach is sketched below.)
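On the first question: since true labels (final sale prices) arrive with a long delay, a common proxy is to monitor the model's inputs and predictions for drift rather than its accuracy. Below is a minimal sketch of one such proxy, the population stability index (PSI), comparing the score distribution seen at training time to the scores produced in production. The example data and the 0.2 alert level are rule-of-thumb assumptions, not anything from the presentation the post describes.

```python
import numpy as np

def population_stability_index(baseline_scores, live_scores, bins=10):
    """Compare two score distributions; larger values mean more drift."""
    edges = np.histogram_bin_edges(baseline_scores, bins=bins)
    base_counts, _ = np.histogram(baseline_scores, bins=edges)
    live_counts, _ = np.histogram(live_scores, bins=edges)
    # Convert to proportions and floor them to avoid log(0).
    base_p = np.clip(base_counts / base_counts.sum(), 1e-6, None)
    live_p = np.clip(live_counts / live_counts.sum(), 1e-6, None)
    return float(np.sum((live_p - base_p) * np.log(live_p / base_p)))

# Example: predicted house prices at training time vs. this week's predictions.
rng = np.random.default_rng(0)
training_preds = rng.normal(300_000, 50_000, size=5_000)
recent_preds = rng.normal(340_000, 60_000, size=1_000)  # market has shifted

psi = population_stability_index(training_preds, recent_preds)
if psi > 0.2:  # common rule-of-thumb alert level
    print(f"PSI={psi:.2f}: prediction drift detected, consider retraining")
```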
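On the second point: one common approach is to attach explicit costs to false positives and false negatives and pick the decision threshold that minimizes total cost on a validation set, instead of optimizing accuracy. A minimal sketch follows; the fraud costs and the synthetic validation data are invented for illustration.

```python
import numpy as np

def total_cost(y_true, scores, threshold, cost_fp, cost_fn):
    """Total cost of thresholding the scores at `threshold`."""
    preds = scores >= threshold
    false_positives = np.sum(preds & (y_true == 0))
    false_negatives = np.sum(~preds & (y_true == 1))
    return false_positives * cost_fp + false_negatives * cost_fn

def pick_threshold(y_true, scores, cost_fp, cost_fn):
    """Choose the threshold that minimizes total cost on validation data."""
    candidates = np.unique(scores)
    costs = [total_cost(y_true, scores, t, cost_fp, cost_fn) for t in candidates]
    return candidates[int(np.argmin(costs))]

# Fraud example with invented costs: missing fraud ($500) hurts far more
# than flagging a legitimate transaction for manual review ($5).
rng = np.random.default_rng(1)
y_val = rng.integers(0, 2, size=1_000)
val_scores = np.clip(y_val * 0.4 + rng.random(1_000) * 0.6, 0, 1)

best = pick_threshold(y_val, val_scores, cost_fp=5, cost_fn=500)
print(f"cost-minimizing threshold: {best:.2f}")
```

With a false negative a hundred times more expensive than a false positive, the chosen threshold ends up well below 0.5, which matches the intuition in the post that fraud models should err toward flagging.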

submitted by /u/da_chosen1

Source: Reddit Data Science
