Data Engineering

Microsoft Florence-2

A Tiny Titan in Computer Vision

By Venkata Sai Santosh

Microsoft Florence-2

Microsoft has launched a new vision foundation model called Florence-2. It is trained as a generalist model (a unified representation for a variety of vision tasks) that can be fine-tuned for tasks such as segmentation, object detection, OCR, and image description. Despite being a small model, it does a pretty good job of beating larger models such as Flamingo 80B on zero-shot tasks.

Here is an example of the model's output for object detection, where I asked the model to detect wheels. Here is the result.

As we can see, the model has correctly identified the regions of the image where wheels are present, and its output includes both the labels and the bounding boxes, which is quite amazing.
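To work with a result like this in code, here is a minimal sketch of pulling the wheel boxes out of a detection result. It assumes the post-processed output is a dictionary keyed by a task name such as `<OD>`, holding parallel `bboxes` and `labels` lists, which is the layout I understand the Hugging Face model card to show. The numbers below are made up for illustration, and `boxes_for_label` is a hypothetical helper, not part of any library.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def boxes_for_label(result: Dict[str, Dict[str, list]], task: str, keyword: str) -> List[Tuple[str, Box]]:
    """Return (label, box) pairs whose label contains `keyword` (case-insensitive)."""
    detections = result[task]
    pairs = zip(detections["labels"], detections["bboxes"])
    return [(label, tuple(box)) for label, box in pairs if keyword.lower() in label.lower()]


# Made-up values, shaped like an assumed post-processed detection result.
sample_result = {
    "<OD>": {
        "bboxes": [
            [34.0, 160.5, 597.0, 371.5],
            [454.0, 276.5, 553.5, 370.0],
            [96.0, 280.5, 198.0, 371.0],
        ],
        "labels": ["car", "wheel", "wheel"],
    }
}

wheels = boxes_for_label(sample_result, "<OD>", "wheel")
for label, (x1, y1, x2, y2) in wheels:
    # Each box is given by its top-left and bottom-right corners.
    print(f"{label}: ({x1}, {y1}) -> ({x2}, {y2})")
```

Keeping labels and boxes as parallel lists means the two must always be read together, which is why the helper zips them before filtering.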

Here is the link to the Hugging Face Space where you can try this model: [Florence-2]

How the Florence Model is Built

Microsoft presents this as a 2D view in which the x-axis represents spatial hierarchy and the y-axis represents semantic granularity.

X-axis (spatial hierarchy): this represents how deep you go into the image. From left to right the task becomes more complex: image level (classification tasks), region level (object detection tasks), and pixel level (image segmentation tasks).

Y-axis (semantic granularity): this runs from none (no semantics), through coarse semantics (level 1 captioning and description, as shown in the image, simply stating person, car, road), to fine-grained semantics (level 2 captioning, a very detailed explanation of the image, as shown in the figure).

The model is therefore trained to perform all of these tasks through a single unified representation. The aim is to replace current large vision models, which perform very well on specific transfer-learning problems but fail to excel across all the tasks in the hierarchy.
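One way to picture the two axes is as a small lookup table. The sketch below is illustrative only: the spatial levels come from the description above, while the semantic level assigned to each task is a guess for the sake of the example and is not taken from the paper. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class SpatialLevel(IntEnum):
    IMAGE = 0   # whole-image tasks
    REGION = 1  # box-level tasks
    PIXEL = 2   # mask-level tasks


class SemanticLevel(IntEnum):
    NONE = 0
    COARSE = 1
    FINE = 2


@dataclass(frozen=True)
class VisionTask:
    name: str
    spatial: SpatialLevel
    semantic: SemanticLevel  # illustrative placement, not from the paper


TASKS = [
    VisionTask("image classification", SpatialLevel.IMAGE, SemanticLevel.COARSE),
    VisionTask("detailed captioning", SpatialLevel.IMAGE, SemanticLevel.FINE),
    VisionTask("object detection", SpatialLevel.REGION, SemanticLevel.COARSE),
    VisionTask("image segmentation", SpatialLevel.PIXEL, SemanticLevel.COARSE),
]


def tasks_at(spatial: SpatialLevel) -> List[str]:
    """List the task names that sit at a given spatial level."""
    return [task.name for task in TASKS if task.spatial == spatial]


print(tasks_at(SpatialLevel.IMAGE))
print(tasks_at(SpatialLevel.REGION))
```

A generalist model in this picture is one that covers every cell of the grid rather than a single row or column.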

Dataset Preparation

To help Florence-2 excel at multiple tasks, Microsoft created a high-quality annotated dataset, FLD-5B, which contains 5.4 billion annotations on 126 million images.

Florence Data Engine:

Step-1: Microsoft generated initial annotations for the images with existing specialist models, each dedicated to a particular task, for example the Azure OCR API for OCR and a specialist model for object detection.

Step-2: A multifaceted filtering process refines the annotations and eliminates undesired ones. The general filtering protocol focuses mainly on two data types in the annotations: text and region data.

Step-3: They trained a multitask model adept at processing sequential data. Evaluating this model against the training images revealed significant improvements in its predictions, especially in cases where the original labels contained inaccuracies or irrelevant information, which is common in alt-texts. Inspired by these promising results, they incorporated the refined annotations alongside the originals.
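The three steps above can be sketched as a toy pipeline. This is only an illustration of the flow (annotate, filter, refine), not Microsoft's implementation: the `Annotation` fields, the filter thresholds, and the stand-in specialist functions are all invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Annotation:
    kind: str           # "text" or "region"
    value: str          # caption text or region label
    score: float = 1.0  # hypothetical specialist confidence


Specialist = Callable[[str], List[Annotation]]


def annotate(image_id: str, specialists: List[Specialist]) -> List[Annotation]:
    """Step 1: collect annotations from every specialist model."""
    return [ann for specialist in specialists for ann in specialist(image_id)]


def keep(ann: Annotation, min_words: int = 2, min_score: float = 0.5) -> bool:
    """Step 2: toy filter with one rule for text and one for region data."""
    if ann.kind == "text":
        return len(ann.value.split()) >= min_words
    return ann.score >= min_score


def refine(originals: List[Annotation], predictions: List[Annotation]) -> List[Annotation]:
    """Step 3: keep the originals and add model predictions not already present."""
    seen = {(a.kind, a.value) for a in originals}
    return originals + [p for p in predictions if (p.kind, p.value) not in seen]


# Stand-ins for real specialist models (an OCR service, a detector).
def fake_ocr(image_id: str) -> List[Annotation]:
    return [Annotation("text", "STOP"), Annotation("text", "speed limit 40")]


def fake_detector(image_id: str) -> List[Annotation]:
    return [Annotation("region", "car", 0.9), Annotation("region", "blob", 0.2)]


raw = annotate("img-001", [fake_ocr, fake_detector])
filtered = [ann for ann in raw if keep(ann)]
# Pretend these came from the multitask model trained on the filtered data.
model_predictions = [Annotation("region", "car", 0.95), Annotation("region", "wheel", 0.8)]
refined = refine(filtered, model_predictions)
print([(ann.kind, ann.value) for ann in refined])
```

In this toy run the one-word text and the low-confidence region are filtered out, and the model's new "wheel" prediction is added next to the surviving originals.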

Model Architecture

As shown in the image, the input image is passed into a vision encoder, which produces image token embeddings. Alongside the image, a multitask prompt is converted into language token embeddings. The image and language token embeddings are then concatenated and passed into the multimodal transformer encoder and decoder blocks, which generate the final output: text plus location tokens.
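To make "location tokens" more concrete, here is a minimal sketch of how a bounding box could be turned into such tokens by quantizing its coordinates. It assumes 1,000 bins per axis, which is my reading of the paper, and a `<loc_k>` token spelling, which is an assumption about the tokenizer. The function name and the example box are hypothetical.

```python
from typing import List, Tuple

NUM_BINS = 1000  # assumption: number of coordinate bins per axis


def box_to_location_tokens(
    box: Tuple[float, float, float, float],
    image_size: Tuple[int, int],
    num_bins: int = NUM_BINS,
) -> List[str]:
    """Quantize (x1, y1, x2, y2) pixel coordinates into discrete location tokens."""
    x1, y1, x2, y2 = box
    width, height = image_size
    tokens = []
    for value, extent in ((x1, width), (y1, height), (x2, width), (y2, height)):
        # Scale to [0, num_bins) and clamp so a coordinate on the far edge stays in range.
        bin_index = min(int(value / extent * num_bins), num_bins - 1)
        tokens.append(f"<loc_{bin_index}>")
    return tokens


# A made-up wheel box in a 640x480 image.
tokens = box_to_location_tokens((160, 120, 480, 360), (640, 480))
target = "wheel" + "".join(tokens)
print(target)  # wheel<loc_250><loc_250><loc_750><loc_750>
```

Because boxes become ordinary tokens, the decoder can emit labels and coordinates in one output sequence, so detection and captioning can share the same text-generation head.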

Given the input x, which combines the image and the prompt, and the target y, they use standard language modeling with a cross-entropy loss for all tasks.
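Written out, a standard autoregressive cross-entropy objective of this kind takes the following form. Here θ denotes the model parameters and |y| the number of target tokens; both symbols are introduced here for illustration.

```latex
\mathcal{L} = -\sum_{i=1}^{|y|} \log P_{\theta}\left(y_i \mid y_{<i},\, x\right)
```

Each target token y_i is predicted from the input x and the tokens before it, so the same loss applies whether the target is a caption, a label, or a sequence of location tokens.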

Conclusion

Kudos to the Microsoft team for building this model. Despite its smaller size, its performance is impressive and comparable with that of other large vision models. I see huge potential in fine-tuning this model further for custom tasks and using it as a replacement for much larger computer vision models, which would reduce cost and inference time while still giving good-quality results.

References

Research Paper [https://arxiv.org/pdf/2311.06242]

By Venkata Sai Santosh

Senior Data Scientist
