Overview
- Researchers started with the open-source StarChat-Beta model, prompting it to generate SwiftUI programs from UI descriptions, and then filtered the outputs with a two-stage validation loop (a rough sketch of such a loop follows this list).
- Each round of generation and validation added higher-quality examples to the training set, which grew to about 996,000 distinct SwiftUI programs after five iterations.
- UICoder significantly outperformed its StarChat-Beta base on both automated metrics and human evaluations, and matched or exceeded GPT-4 in compilation success rate.
- The team found that StarChat-Beta’s original training data contained almost no SwiftUI code, because Swift repositories were excluded from TheStack and OpenAssistant-Guanaco, the datasets it was trained on.
- The authors suggest that this automated synthetic-data and feedback-driven finetuning approach could extend to other programming languages and UI toolkits.
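
To make the generate-and-filter loop above more concrete, here is a minimal Python sketch of one way such a pipeline could be wired together. Everything in it is illustrative rather than taken from the paper: `compiles`, `relevance_score`, `build_dataset`, the `model.generate()` call, and the caller-supplied `finetune_fn` are hypothetical names, and the second validation stage is left as a placeholder. The compile check shells out to `swiftc -typecheck`, which requires a toolchain and SDK that provide the SwiftUI module.

```python
import subprocess
import tempfile
from pathlib import Path


def compiles(swift_source: str) -> bool:
    """Stage 1: keep only candidates that the Swift compiler accepts."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "Candidate.swift"
        src.write_text(swift_source)
        # swiftc exits non-zero on any compile error; SwiftUI code additionally
        # needs an SDK that ships the SwiftUI module (e.g. on macOS).
        result = subprocess.run(["swiftc", "-typecheck", str(src)],
                                capture_output=True)
        return result.returncode == 0


def relevance_score(swift_source: str, description: str) -> float:
    """Stage 2 (placeholder): score how well the rendered UI matches the
    text description, e.g. with a vision-language model. Not implemented here."""
    raise NotImplementedError


def build_dataset(model, descriptions, finetune_fn, rounds=5, threshold=0.5):
    """Iteratively generate, filter, and accumulate SwiftUI training examples."""
    dataset = {}  # keyed by program text, which also removes exact duplicates
    for _ in range(rounds):
        for desc in descriptions:
            candidate = model.generate(desc)        # hypothetical generation API
            if not compiles(candidate):
                continue                            # drop non-compiling programs
            if relevance_score(candidate, desc) < threshold:
                continue                            # drop irrelevant programs
            dataset[candidate] = desc
        model = finetune_fn(model, dataset)         # retrain on the cleaner set
    return dataset, model
```

In a sketch like this, keying the dataset by program text gives exact-duplicate removal for free, and each pass retrains the model only on candidates that survived both the compiler check and the relevance filter.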