Architecture
Three-Layer Architecture
1. Perception
   Encoding visual input into numerical representations.
   Most visual AI stops here.
2. Reasoning
   Connecting, comparing, and planning across knowledge.
3. Expression
   Translating internal state into human language.
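The three layers above can be read as a pipeline in which each stage consumes the previous stage's output. The following is a minimal sketch under stated assumptions, not the actual system: the names `perceive`, `reason`, `express`, `Percept`, and `Thought` are hypothetical, the "encoding" is just mean brightness per image row, and the "knowledge" is a small dictionary of labeled reference vectors.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

# Assumption: an image is a grid of grayscale values in [0, 1].
Image = Sequence[Sequence[float]]


@dataclass
class Percept:
    """Output of layer 1: a numerical representation of the input."""
    features: List[float]


@dataclass
class Thought:
    """Output of layer 2: the internal state after comparing with knowledge."""
    closest_label: str
    distance: float


def perceive(image: Image) -> Percept:
    # Layer 1 (Perception): encode visual input as numbers.
    # Toy encoding: mean brightness of each row.
    return Percept([sum(row) / len(row) for row in image])


def _euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def reason(percept: Percept, knowledge: Dict[str, List[float]]) -> Thought:
    # Layer 2 (Reasoning): compare the percept against stored knowledge
    # and pick the nearest known concept.
    label = min(knowledge, key=lambda k: _euclidean(percept.features, knowledge[k]))
    return Thought(label, _euclidean(percept.features, knowledge[label]))


def express(thought: Thought) -> str:
    # Layer 3 (Expression): translate internal state into human language.
    return (
        f"This looks most like '{thought.closest_label}' "
        f"(distance {thought.distance:.2f})."
    )


def describe(image: Image, knowledge: Dict[str, List[float]]) -> str:
    # The full pipeline: perception, then reasoning, then expression.
    return express(reason(perceive(image), knowledge))


if __name__ == "__main__":
    image = [[0.9, 1.0], [0.1, 0.0]]
    knowledge = {"bright top": [1.0, 0.0], "bright bottom": [0.0, 1.0]}
    print(describe(image, knowledge))
```

A system that stops at the first layer would return only the `Percept` feature vector; the second and third stages are what turn that vector into a comparison against knowledge and then into a sentence.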