Powered by GPT-4V, the framework takes screenshots as input and outputs mouse clicks and keyboard commands, just as a human would.
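
The loop described above (screenshot in, mouse and keyboard actions out) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the framework's actual code: the JSON action format, `RecordingDesktop`, `run_agent`, and `scripted_model` are hypothetical names, and the model call is replaced by a scripted stand-in rather than a real GPT-4V request.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RecordingDesktop:
    """Hypothetical stand-in for a real screen and input devices.

    It records the actions it receives instead of moving a real mouse.
    """
    events: List[str] = field(default_factory=list)

    def screenshot(self) -> bytes:
        # A real implementation would capture the screen here.
        return b"fake-image-bytes"

    def click(self, x: int, y: int) -> None:
        self.events.append(f"click({x}, {y})")

    def type_text(self, text: str) -> None:
        self.events.append(f"type({text!r})")


def run_agent(desktop: RecordingDesktop,
              model: Callable[[bytes], str],
              max_steps: int = 10) -> List[str]:
    """Send a screenshot to the model, apply the returned action, repeat."""
    for _ in range(max_steps):
        # Assumed reply format: a JSON object with an "action" key.
        action = json.loads(model(desktop.screenshot()))
        kind = action.get("action")
        if kind == "click":
            desktop.click(int(action["x"]), int(action["y"]))
        elif kind == "type":
            desktop.type_text(str(action["text"]))
        elif kind == "done":
            break
        else:
            raise ValueError(f"unsupported action: {kind!r}")
    return desktop.events


def scripted_model() -> Callable[[bytes], str]:
    """Hypothetical model stub that replays a fixed list of replies."""
    replies = iter([
        '{"action": "click", "x": 120, "y": 48}',
        '{"action": "type", "text": "hello"}',
        '{"action": "done"}',
    ])
    return lambda _image: next(replies)
```

Calling `run_agent(RecordingDesktop(), scripted_model())` replays the scripted replies against the recording desktop. In a real setup the stub would be swapped for a call to a vision-capable model and the recorder for an input-automation library.
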
View Article on VentureBeat