Representing Unstructured data into meaningful embeddings
Now let us first understand some basic terminology
‣Unstructured data - It is data that is not in a tabular format or that lacks schema example data of images and text.
‣Embeddings - It is a representation of any unstructured data into vector space, they are usually multidimensional.
With these terms, the idea should b pretty clear
Now let us pick text data
As we can not feed a sentence of strings to any machine learning model. Text data can be converted to a higher vector space of embeddings. Each word with context is represented into numbers or in a multidimensional array. This multidimensional representation or embeddings computation in the text can be computed by.
‣Bert
‣Glove
‣Keras embeddings layers
My thread on to clear these basic of embeddings computation and usage for text

Let's talk about images
When we read images they are just pixel representation and we get a multi-dimension array should I call it embeddings?
Maybe or maybe not
I would rather call after passing it to a pre-trained model and compute meaningful embeddings.
Meaningful embeddings mean that it should attain a property or set of values that are unique.
Computing embeddings for images by
‣Pretrained models like inception, resent (cutting layers for better dimensions)
‣Using Keras embedding layer
My thread on to clear these basics of embeddings computation and usage for images


Now the advantage of using these embeddings
‣They can be used to train a model for classifying
‣They can be used to compute cosine similarities between two data points.
‣They can be used to perform unsupervised learning that is we can perform clustering on embeddings of multiple data points and can convert them into groups of similar embeddings
Feel free to jump over to the explorations of embeddings!
Thanks for reading!