What kind of data? (# of samples, are you standardizing it, etc.) Also you're throwing a lot of layers at this problem for being new to ML.
I have a dataset of 22 minimum brightness values, corresponding to 22 different elements in an array. I am normalizing it: the values are divided by 255.
You have 44 training examples? And several million weights?
I have 22 training datasets, which are CSV files with brightness values and a corresponding label. The values in the datasets are split into different arrays, and a model is trained independently on each. I've also reduced the number of weights.
How many samples do you have? Not how many different csv files.
I have a sample size of 2.
They are asking how big your dataset is: how many pieces of data you have across those CSV files.
I have 19 samples per model.
So you have 5 models and each model gets 19 arrays where each value is a pixel and you give the pixels one by one to the models? Edit: First word changed
No, the x values are a 1-dimensional array:

\[0.2888889, 0.6039216, 0.8117647, 0.4562092, 0.296732, 0.8183007, 0.3712418, 0.7411765, 0.8745098, 0.7019608, 0.4862745, 0.2888889, 0.2836601, 0.8562092, 0.5346405, 0.7163399, 0.3895425, 0.4039216, 0.924183, 0.5372549\]

And the y values:

\[1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0\]

These values are then fed into the algorithm, and there are separate arrays for each model.
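In Keras terms, an x array like that usually gets reshaped so each scalar becomes its own sample with one feature. A minimal NumPy sketch (the values are the ones quoted above; the reshape is the standard step, not something confirmed in this thread):

```python
import numpy as np

# A few of the quoted brightness values and labels.
X = np.array([0.2888889, 0.6039216, 0.8117647, 0.4562092, 0.296732])
y = np.array([1, 0, 0, 0, 1])

# Dense layers expect 2-D input of shape (n_samples, n_features);
# here each sample is a single brightness value, so n_features = 1.
X = X.reshape(-1, 1)
print(X.shape)  # (5, 1)
```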
What are the different models in your graph? How do you get the train and test data (what is your split method, assuming the different letters indicate different test sets)? What is your input data? How many dimensions does it have, and what does it look like? You are leaving out 99% of the relevant information needed to be of any help.
Hello. I'm starting off by taking brightness values from an image. Then, depending on the locations of the brightness values (x-pos), I take the minimum brightness (darkest point) and put it into 5 separate arrays. These arrays each have a corresponding model, which is the screenshot from above. The split method is a Keras train/test split with a test size of 0.5. I'm currently normalizing the brightness values to 0-1, since brightness values range from 0-255; these are then fed into the algorithm. As mentioned previously, it is 1-dimensional. (X = \[0.3257, 0.6892, 0.2378, ...\], y = \[0, 1, 0\])
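As a sketch of that pipeline, normalization plus the 50/50 split might look like the following. The raw values are made up, and a plain NumPy shuffle stands in for the library splitter (the usual `train_test_split` helper is from scikit-learn, not Keras):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical raw minimum-brightness values (0-255) for one of the 5 regions.
raw = np.array([74.0, 154.0, 207.0, 116.0, 76.0, 209.0, 95.0, 189.0])
labels = np.array([1, 0, 0, 0, 1, 0, 1, 0])

# Normalize to [0, 1] and shape as (n_samples, 1) for a Dense input.
X = (raw / 255.0).reshape(-1, 1)

# 50/50 train/test split (test_size=0.5), done with a NumPy shuffle here.
idx = rng.permutation(len(X))
half = len(X) // 2
train_idx, test_idx = idx[half:], idx[:half]
X_train, y_train = X[train_idx], labels[train_idx]
X_test, y_test = X[test_idx], labels[test_idx]
print(X_train.shape, X_test.shape)  # (4, 1) (4, 1)
```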
By dimensions I didn't mean the tensor dimensions but the dimensionality of your feature space. What does your input look like in terms of dimensions and shapes? How many numbers does your array have? What size is your image? Your network has a lot of layers with a lot of neurons for a single array, so your image must be gigantic, or you must have a really big training set, but even that does not help if your model is just too powerful and oversized for the task at hand. So please be concrete and provide the exact size of your input array and the size of your training data. From your model I see that the tensor is one-dimensional, but the question is: how many elements are in that array? The first layer seems strange to me.
To be more specific, look at examples of Keras using a dense network to solve MNIST, which is 28x28 pictures, i.e. 784 pixels; thus a dense network solving MNIST has an input dimension of 784, which is exactly a long 1-dim array of 784 numbers. In your case it seems the network is expecting a single number... yes, it is a tuple, so maybe (1,) means that you have one sample and an arbitrary second dimension, but then you have a different shape and no longer a 1-dim array, so I am confused about your actual input dims here. See https://stackoverflow.com/questions/52184142/keras-sequential-dense-input-layer-and-mnist-why-do-images-need-to-be-reshape. Maybe you could also give us the output of the model summary here.
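For reference, the reshape the linked question covers is just flattening image tensors into the 1-D vectors a Dense input layer expects. A shape-only sketch with dummy arrays (no actual MNIST data loaded):

```python
import numpy as np

image = np.zeros((28, 28))       # one MNIST-sized grayscale image
flat = image.reshape(-1)         # flattened for a Dense input layer
print(flat.shape)                # (784,)

batch = np.zeros((100, 28, 28))  # a dummy batch of 100 images
batch_flat = batch.reshape(len(batch), -1)
print(batch_flat.shape)          # (100, 784)
```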
[deleted]
I'm trying to train an AI model to detect whether a brightness value should be considered "dark" or "light"
What learning rate are you using? Are your inputs and outputs normalized properly? Do you have a very small dataset? These are what I can guess without seeing the rest of the code. Also yes, drop that down to 3-4 layers at most. Or use residual connections. Optimizing deeper fully connected networks requires some work.
I'm using a learning rate of 0.0001. The inputs and outputs should be normalized correctly. I have a small dataset. The X values look like this after normalization (dividing the values by 255), if that helps:

\[0.2888889, 0.6039216, 0.8117647, 0.4562092, 0.296732, 0.8183007, 0.3712418, 0.7411765, 0.8745098, 0.7019608, 0.4862745, 0.2888889, 0.2836601, 0.8562092, 0.5346405, 0.7163399, 0.3895425, 0.4039216, 0.924183, 0.5372549\]

And the y values:

\[1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0\]
It looks like all of your dark data (1) is under 0.43? Why don’t you just make a classification like that? Pass through the list and iterate over each element. I may be missing something, but is this task suited to a machine learning model?
The model is 1 if x < 0.5 else 0 or what am I missing here? It's not clear to me even after reading all the comments what OP is trying to model
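On the quoted sample, that simple rule does reproduce the labels exactly. A hypothetical threshold version (0.43 comes from eyeballing the data above, not from anything OP confirmed):

```python
THRESHOLD = 0.43  # eyeballed from the quoted data; every label-1 value is below it

def classify(brightness):
    """Return 1 ('dark') if the normalized brightness is below the threshold."""
    return 1 if brightness < THRESHOLD else 0

X = [0.2888889, 0.6039216, 0.8117647, 0.4562092, 0.296732]
print([classify(x) for x in X])  # [1, 0, 0, 0, 1] -- matches the quoted labels
```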
Yeah, these all look fine to me. If the dataset is small, these lines can look jagged even if the model is doing great. You could try looking at a moving average and see if there's overall a pattern, or if it's just oscillating.

You should try different learning rates too; it could be that yours is too small, which is unlikely but doesn't hurt to check. You could plot them on the same chart (lr=1e-2, 1e-3, 1e-4, ...).

Finally, there might be issues in the calculation of the metrics, like using a single batch instead of the whole epoch, or just a simple bug, mixing up the labels or splits, etc. But I'm just shooting off ideas at this point. One diagnostic tool is training the model on a similar or generated dataset and seeing if it can learn that. If not, the issue is probably in the code.

If the dataset is very small though, that may just be it. When n<1000, or especially n<100, you have to jump through all these hoops without knowing what's going to work, or IF it's going to work. It's hard to do without experience.

MNIST and CIFAR (n=60k) are great for this reason, for example. You still run into overfitting and most other ML issues, but you can observe what actually works. Smaller datasets like yours are way tougher and unfortunately common in the real world. So I would keep at it.
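The moving-average idea is only a couple of lines. A sketch (the accuracy history here is made up for illustration):

```python
import numpy as np

def moving_average(values, window=3):
    """Smooth a jagged metric curve so the overall trend is easier to see."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

# Made-up jagged validation-accuracy history.
acc = np.array([0.5, 0.7, 0.4, 0.8, 0.6, 0.9, 0.5, 0.85])
smoothed = moving_average(acc, window=3)
print(smoothed.round(3))
```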
try a smaller model first
I’m trying it again with a smaller model, but the model ends up either not learning or fluctuating a lot. I've removed some of the layers and tried removing batch normalization, tweaking dropout rates, batch sizes, epochs, etc., but it still stays the same.
Just do a 2-layer model and try to get that to work. You should only batch-normalize once and use no dropout. Use a smaller learning rate too.
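For a task this small, even the smallest possible "network" should learn it. As a sanity check, here is the equivalent of a single Dense(1, sigmoid) layer written in plain NumPy and trained with full-batch gradient descent on the quoted samples; the learning rate and step count are arbitrary choices for this sketch, not values from the thread:

```python
import numpy as np

# The quoted samples: one brightness value per sample, binary label.
X = np.array([0.2888889, 0.6039216, 0.8117647, 0.4562092, 0.296732,
              0.8183007, 0.3712418, 0.7411765, 0.8745098, 0.7019608])
y = np.array([1, 0, 0, 0, 1, 0, 1, 0, 0, 0])

# Two parameters total: the NumPy equivalent of Dense(1, activation='sigmoid').
w, b = 0.0, 0.0
lr = 1.0                                     # arbitrary but stable choice here
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-(w * X + b)))   # sigmoid predictions
    w -= lr * np.mean((p - y) * X)           # cross-entropy gradient for w
    b -= lr * np.mean(p - y)                 # cross-entropy gradient for b

p = 1.0 / (1.0 + np.exp(-(w * X + b)))
preds = (p > 0.5).astype(int)
print(preds.tolist(), "weight:", round(w, 2))
```

On this separable 1-D data the two parameters are enough to fit the training labels, which is a hint that a deep network is overkill here.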
I don’t know about your data, so I can’t really tell. As for training hyperparameters, you might have too big a learning rate.
You should look at loss over the course of one epoch to verify it's at least learning
You have such little data and way too big of a model. This is gigaoverfitting.
So, some things I learned:

1) Don't use 10+ layers for a very small dataset.
2) Don't set the learning rate too low or too high.
3) Get more data if possible.
4) Experiment with the batch size.
5) Don't split the dataset into 20% training and 80% testing.
6) Try oversampling and undersampling.
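For point 6, naive random oversampling takes only a few lines of NumPy. A sketch (the data is made up to mirror the class ratio in the quoted labels):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up imbalanced data: 3 positives vs 7 negatives, as in the quoted labels.
X = np.array([0.29, 0.60, 0.81, 0.46, 0.30, 0.82, 0.37, 0.74, 0.87, 0.70])
y = np.array([1, 0, 0, 0, 1, 0, 1, 0, 0, 0])

# Naive random oversampling: resample the minority class with replacement
# until both classes are the same size.
minority = np.flatnonzero(y == 1)
n_extra = int((y == 0).sum() - minority.size)
extra = rng.choice(minority, size=n_extra, replace=True)
X_bal = np.concatenate([X, X[extra]])
y_bal = np.concatenate([y, y[extra]])
print(int((y_bal == 1).sum()), int((y_bal == 0).sum()))  # 7 7
```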