Introduction#
This article analyzes the technical goals and four evolutionary directions of cross-end framework technology, then takes an inside look at how self-drawn cross-end solutions in the industry are implemented, including Kun, WebF, TDF, Weex 2.0, Waft, and MiniX. It examines their characteristics and shortcomings, summarizes the R&D ideas and key technical points of cross-end frameworks, and closes with thoughts on where cross-end frameworks are heading.
Along the way, it covers script engine selection and its technical difficulties, the dilemmas faced by various cross-end frameworks in the industry, how Debuggers work, and the engineering approach of one core with multiple ecosystems.
The outline is as follows:
- Technical goals of cross-end frameworks (brief)
- Technical directions of cross-end frameworks
- Technical revelations of cross-end frameworks
- Technical points of cross-end frameworks
- Development trends of cross-end frameworks (brief)
Note 1: This article is a textual summary of an offline sharing session, omitting the background information (such as Sections 1 and 5) and some technical details (such as parts of Sections 3-4), retaining only the core content.
Note 2: The materials for this article come from public talks on the cross-end theme at the GMTC conference, public WeChat articles from some companies, publicly available framework source code, and my own earlier sharing materials. Other internal talks and internal framework materials have been redacted.
Technical Directions of Cross-End Frameworks#
I group cross-end frameworks into four technical directions.
Direction 1: WebView-Based Enhancements#
WebView-based enhancement is a direction in which front-end technology moves toward the client: the upper-layer ecosystem still uses web technologies, but the client supplements WebView with additional capabilities.
For example:
- Hybrid frameworks like Ionic and Cordova.
- Numerous mini-programs based on WebView in the industry.
- Solutions like Sonic that pre-load WebView resources on the client side.
- Offline package solutions developed by app vendors for the web.
These types of frameworks have the following characteristics:
- Rendered based on WebView but supplemented with some native capability enhancements.
- The development ecosystem is based on the web front-end ecosystem (strictly speaking, mini-programs are also included).
- They strive to enhance the user experience of the web.
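The offline package idea from the examples above can be illustrated with a minimal Python sketch. It is not any vendor's implementation: `OfflinePackage`, `resolve`, and the example URLs are hypothetical names chosen for illustration. The sketch only shows the core lookup, where a request is served from a local bundle when possible and otherwise left to the network.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class OfflinePackage:
    """A bundle of web resources shipped with, or pre-downloaded by, the app."""
    base_url: str               # URL prefix this package covers (hypothetical)
    files: Dict[str, bytes]     # relative path -> file content


def resolve(url: str, package: OfflinePackage) -> Optional[bytes]:
    """Return the local copy of `url` if the package has it, else None.

    None means the WebView should fall back to a normal network request.
    """
    if not url.startswith(package.base_url):
        return None
    # Drop the query string so "index.html?v=2" still matches "index.html".
    relative_path = url[len(package.base_url):].split("?", 1)[0]
    return package.files.get(relative_path)


package = OfflinePackage(
    base_url="https://example.com/app/",
    files={"index.html": b"<html>cached</html>"},
)
```

In a real client this lookup would sit inside the WebView's request interception hook, and the package itself would be versioned and updated in the background.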
Direction 2: DSL-Based Native Enhancements#
DSL-based native enhancement is a direction in which client technology moves toward the front end: the development ecosystem is native, but the framework design borrows some characteristics of WebView. These frameworks generally define a DSL to achieve cross-end and dynamic capabilities.
Examples include Taobao's DinamicX, Meituan's MTFlexbox, and Alibaba's Tangram.
Browsing the documentation of these frameworks reveals that their design is quite similar to React Native, except that developers use a DSL to declare components.
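As a rough illustration of the DSL approach, the sketch below renders a JSON-like template into a tree of stand-in native views. The template schema, the `${key}` binding syntax, and the names `NativeView`, `bind`, and `render` are all assumptions made for this example; they are not taken from DinamicX, MTFlexbox, or Tangram.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class NativeView:
    """Stand-in for a platform view (for example a text label or a container)."""
    kind: str
    props: Dict[str, Any] = field(default_factory=dict)
    children: List["NativeView"] = field(default_factory=list)


def bind(value: Any, data: Dict[str, Any]) -> Any:
    """Replace a "${key}" placeholder with the matching entry in `data`."""
    if isinstance(value, str) and value.startswith("${") and value.endswith("}"):
        return data.get(value[2:-1], "")
    return value


def render(node: Dict[str, Any], data: Dict[str, Any]) -> NativeView:
    """Turn one DSL node (and its children) into a NativeView tree."""
    props = {key: bind(val, data) for key, val in node.get("props", {}).items()}
    children = [render(child, data) for child in node.get("children", [])]
    return NativeView(node["type"], props, children)


# The template can be delivered over the network, which is where the
# dynamic capability comes from; the data is bound on the device.
template = {
    "type": "container",
    "props": {"direction": "column"},
    "children": [
        {"type": "text", "props": {"text": "${title}", "fontSize": 16}},
        {"type": "image", "props": {"src": "${cover}"}},
    ],
}

view = render(template, {"title": "Hello", "cover": "https://example.com/a.png"})
```

The same template can be interpreted by an Android renderer and an iOS renderer, which is what makes the approach cross-end.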
Direction 3: Code Sharing#
Code sharing is a terminal development solution. The best-known example in the industry is KMM (Kotlin Multiplatform Mobile), which compiles Kotlin source code into target code for each platform through the K2 compiler, thus achieving cross-end functionality. Specifically, the compiler frontend turns Kotlin source into FIR (Frontend Intermediate Representation), which carries semantic information. FIR is then handed to each platform's compiler backend (such as JVM or LLVM) for optimization and code generation, ultimately producing executable target code for each platform.
The engineering structure of KMM is also quite simple, consisting of cross-end code (Shared module) and shell projects (Android/iOS App).
Currently, this solution has some shortcomings on iOS, particularly in handling multi-threaded logic, which the business side has to work around.
Direction 4: GPL-Based Native Enhancements#
I view GPL-based native enhancements (GPL here meaning a general-purpose language, as opposed to a DSL) as a development solution for the "large terminal". The large terminal is a product of convergence. In the early PC era, developers used .NET, the JVM, and Qt to develop desktop applications, which we referred to as terminal development. Then, in the mobile era, native client development kept pursuing dynamic capabilities, while cross-platform front-end development pursued performance. These two paths ultimately merged into the cross-end frameworks we see today. Mini-programs, Flutter, DSL development frameworks, and WASM are all products of this convergence.
To summarize, Direction 4 includes the following types of solutions:
- Native rendering components: such as React Native / Hippy 1.0 / Weex 1.0, etc.
- Self-drawn engines: Flutter, which is treated as its own category for now.
- Self-drawn frameworks based on Flutter: there are many in the industry (I previously catalogued more than 20), such as WebF (Kraken), Kun, FMP, and mini-programs based on Skyline.
- Self-drawn frameworks based on system graphics libraries (Skia / Vulkan / Metal / OpenGL): there are also many frameworks in the industry, such as TDF, Hippy 3.0, Weex 2.0, Waft, etc. Strictly speaking, Flutter also belongs to this category.
The first three directions and the first two types of frameworks in the fourth direction are open source, and many articles already explain their principles, so I won't elaborate on them. This article focuses on the technical solutions of the last two types of frameworks in the fourth direction.
Technical Revelations of Cross-End Frameworks#
This section will select several representative frameworks for technical revelations:
- Kun
- WebF
- Weex 2.0
- TDF (redacted, omitted)
- Waft
- MiniX (omitted)
These frameworks were selected because their solutions are typical of the field. The analysis focuses on each framework's application development system, script engine, and rendering engine selection.
Kun#
Kun is a cross-end framework developed by Xianyu on top of Flutter; it is not currently open source. The only public material is three articles published on the Xianyu public account. The architecture is relatively simple, so it can be analyzed to some extent even without the source code. I won't include the architecture diagram here; interested readers can refer to the articles.
The overall design idea of Kun is to build a JS runtime on top of Flutter: developers write pages with the front-end ecosystem, and the JS interpreter acts as a glue layer that translates the source code into Flutter Widgets, which the Flutter Engine then renders.
They state that the JS engine is QuickJS, but I suspect it is actually Alibaba's internal QKing engine (based on QuickJS).
The Debugger supports CDP, and testing is based on Flutter golden tests.
For CSS parsing, they first used Yoga as a polyfill and processed styles as CSS-in-JS; the parsing module then borrowed Kraken's legacy CSSLib. The inline styles produced in JS are passed through Dart FFI to the Dart side for processing and are ultimately turned into Flutter Widgets.
However, the CSS box model and document flow are fundamentally incompatible with the style standards of Flutter Widgets. They adopted a widget stitching approach, where each layer of the widget specifically handles certain styles, ultimately achieving component styles through a nested stitching method. The following diagram illustrates the stitching method corresponding to a div element:
To summarize the characteristics:
- Does not support the complete W3C standards (which would be impossible; for example, CSS-in-JS cannot implement pseudo-classes). It supports only subsets of the standards, covering HTML tags, CSS styles, and Web APIs.
- Provides some custom Element components implemented on the Dart side, allowing the business side to also develop some custom Elements using Dart.
- The implementation of components adopts a widget stitching approach.
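The widget stitching approach described above can be sketched in a few lines of Python. Kun's source is not public, so the mapping from style properties to wrapper widgets below is purely an assumption for illustration; only the widget names (`Padding`, `Opacity`, `DecoratedBox`, `Text`) correspond to real Flutter widgets. `Widget`, `WRAPPERS`, `stitch`, and `chain` are hypothetical helpers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Widget:
    """Stand-in for a Flutter widget; `child` is the single nested widget."""
    name: str
    args: Dict[str, object]
    child: Optional["Widget"] = None


# Hypothetical mapping: which wrapper widget handles which style property,
# listed from the outermost wrapper to the innermost one.
WRAPPERS = [
    ("margin", "Padding"),              # margin emulated by an outer Padding
    ("opacity", "Opacity"),
    ("background-color", "DecoratedBox"),
    ("padding", "Padding"),
]


def stitch(style: Dict[str, object], content: Widget) -> Widget:
    """Wrap `content` in one widget per style property that is present."""
    widget = content
    for prop, wrapper in reversed(WRAPPERS):    # build from the inside out
        if prop in style:
            widget = Widget(wrapper, {prop: style[prop]}, widget)
    return widget


def chain(widget: Optional[Widget]) -> List[str]:
    """List widget names from the outermost wrapper down to the content."""
    names = []
    while widget is not None:
        names.append(widget.name)
        widget = widget.child
    return names


div = stitch(
    {"margin": 8, "padding": 4, "background-color": "#fff"},
    Widget("Text", {"data": "hello"}),
)
print(chain(div))  # ['Padding', 'DecoratedBox', 'Padding', 'Text']
```

Even this toy version shows the cost: three style properties already produce three extra layers, and any property whose behavior depends on another (for example a border that affects padding) has to be coordinated across separate wrappers.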
References for this section:
- New Species in the Large Terminal Field - KUN
- The First Major Test of the Third Generation Terminal Container KUN【Architectural Evolution】
- Double Eleven|Exploring KUN's Loading Performance and Enhanced Experience
WebF#
WebF was originally Alibaba's Kraken. After the team disbanded, part of its legacy was handed over to Kun, while the remaining members left to establish openwebf in the open-source community, renaming Kraken to WebF and continuing to maintain it.
Here is the architecture diagram of WebF:
As can be seen, unlike Kun, the team not only provides JSBinding but also developed capabilities on the Dart side of Flutter, enriching RenderObject to meet W3C standards: CSS is implemented in the Dart layer and WebAPI in the C++ layer, both aligned with W3C standards.
The script engine is still QuickJS, but some optimizations have been made, which are worth learning from.
Comparing Kun and WebF, we can see that they adopt two different approaches to handling CSS.
First, let's talk about Kun; its solution has several issues:
- A single rendering pass performs two layouts, which is unnecessary; because layout updates are very frequent, this adds performance overhead.
- Dart FFI cannot keep up with the information transfer that style updates require: style updates carry a large volume of data and hit the FFI bottleneck.
- The development experience for inline styles is poor, and many CSS properties cannot be implemented.
So how should CSS be implemented? There are two better solutions:
- Implement CSS at the Dart layer, relying on RenderObject's Layout for style updates, without going through FFI.
- Implement both DOM and CSS in C++, separating the Dart layer.
Solution 1 is WebF, while Solution 2 is represented by Weex 2.0 and TDF, among others.
However, Solution 1 has its own difficulty: introducing CSS makes the RenderObject tree hard to maintain. There are two approaches to managing it:
- Make RenderObject thin: Flutter Widgets serve as atomic rendering components and RenderObject is left unmodified; the upper layer combines them to achieve complex functions and styles, as Kun does.
- Make RenderObject thick: integrate a large amount of layout rendering capabilities into one, with the upper layer driving RenderObject rendering through stylesheets.
Making RenderObject thick is clearly the better solution, because the complexity of the thin approach is too high (as the nested widget stitching described earlier makes intuitive). Every style rule has to be checked and inferred layer by layer, which hurts maintainability.
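To make the contrast concrete, here is a minimal Python sketch of a "thick" render object: a single object that receives style properties and resolves the box model in its own layout step. It is a simplified illustration, not WebF's implementation; `ThickRenderBox`, `Size`, and the content-box sizing rule are assumptions for this example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Size:
    width: float
    height: float


class ThickRenderBox:
    """One render object that understands the box model by itself."""

    def __init__(self, style: Dict[str, float]):
        self.style = dict(style)
        self.needs_layout = True

    def update_style(self, changes: Dict[str, float]) -> None:
        """A style change marks only this one object dirty."""
        self.style.update(changes)
        self.needs_layout = True

    def layout(self, content: Size) -> Size:
        """Return the border-box size: content plus padding and border."""
        padding = self.style.get("padding", 0.0)
        border = self.style.get("border-width", 0.0)
        extra = 2 * (padding + border)      # applied on both sides of each axis
        self.needs_layout = False
        return Size(content.width + extra, content.height + extra)


box = ThickRenderBox({"padding": 4.0, "border-width": 1.0})
size = box.layout(Size(100.0, 20.0))
```

Because padding and border live in the same object, a stylesheet update touches one node and one layout computation, instead of rebuilding a chain of wrapper widgets.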
Therefore, I am more optimistic about WebF's solution. WebF is also currently the only one of these cross-end frameworks that embraces open source, and I encourage interested readers to join the TSC and help build it.
References for this section:
- https://github.com/openwebf/webf
- Shenghuai: "How WebF Achieves High Performance in Flutter + Web Integration" (2022 QCon)
Weex 2.0#
Weex 2.0 is Alibaba's cross-end solution, open-sourced internally, which has largely realized Alibaba's internal one-core, multi-ecosystem system. Its architecture is a complete rebuild of version 1.0, and the team explored many paths along the way. Judging from the public talk, the solution is quite complete and represents a significant amount of work.
Here is the structure diagram of Weex 2.0:
Let’s focus on introducing these components:
- WeexAbility: Container and capability extension, URL interception, caching, basic API, third-party extensions, etc.
- WeexFramework: General basic framework. Encapsulates page instances, implements DOM, CSSOM, WebAPI standards, and decouples the script engine from the rendering engine.
- QKing: Script engine, a modified version based on QuickJS.
- Unicorn: Self-drawn rendering engine. Implements CSS capabilities, including complete node construction, animations, gestures, layout, drawing, composition, and rasterization rendering pipelines, capable of cross-platform functionality.
- WeexUIKit: Native UI rendering engine, encapsulating native components.
Like the previous frameworks, version 2.0 compiles the JS bundle to bytecode, but the compilation applies some SSA optimizations. The JS runtime has also been heavily optimized. The entire chain is written in C++, so there is no extra communication overhead, no redundant abstraction, and a shorter chain. The self-developed Unicorn engine adds a streamlined layout algorithm and precise control of gestures and animations, and interfaces directly with the system graphics library. The solution shares nothing with version 1.0 and addresses that version's cross-language communication, dual-end rendering differences, layout algorithm, and script execution efficiency problems.
Based on Weex 2.0, Alibaba has resolved the issues of chimney-style solutions, promoting the unification of basic capabilities based on a multi-core homogeneous kernel to support differentiated business scenarios:
References for this section:
- Men Liu: "The Architecture and Practice of Taobao's Next-Generation Self-Drawn Rendering Engine" (2023 GMTC)
Note: Tencent's TDF is working on similar efforts; it is not covered here because the material is redacted.
Waft#
Waft (WebAssembly Framework for Things) is a self-drawn framework developed by the Tmall Genie team on top of a WebAssembly runtime and Skia; it is not open source. Although it is not yet cross-end in practice and only supports AIoT scenarios, its principles would allow it to be cross-end, so it is introduced here to broaden perspectives.
The Tmall Genie team made several earlier attempts in AIoT. They first built an Android app, but the devices had little memory (only a few hundred megabytes), which limited performance. They then tried cloud applications, which worked well but were discontinued because of high server costs. Finally they returned to on-device rendering and developed Waft.
Here is the architecture diagram of Waft:
They also redesigned the loading and rendering processes:
It can be seen that the overall workload is quite large and does not align with front-end standards and ecosystems.
For the script engine selection of WebAssembly, they provided a comparison chart:
I have some doubts about this script engine comparison and see the following shortcomings (they may have other internal considerations):
- The fib test case is too simple to bring out the strengths of a JS engine.
- Comparing AOT with interpreted execution is clearly unfair.
- The QuickJS in the comparison is presumably the stock version, which still has significant room for optimization.
- They did not compare with other engines that have JIT modes, such as V8 and JSCore.
- It does not say which Wasm runtime was used; different Wasm implementations have different performance characteristics, some focusing on interpreter efficiency and others on AOT/JIT efficiency.
Waft itself also has issues, and I hope they can optimize in the future:
- CSS only supports a partial subset.
- W3C standards (DOM Element, WebAPI) implementations are lacking.
- The package size may be relatively large, which is still uncertain.
Thus, the current implementation of Waft limits its application scenarios, and it can only support simple IoT pages for now.
References:
- Nie Xinxin: "Waft: A Development Framework for AIoT Applications Based on WebAssembly and Skia" (2023 GMTC)
Technical Points of Cross-End Frameworks#
Dynamic Capabilities#
Having introduced the above frameworks, we can summarize the application scenarios of cross-end frameworks:
- Dynamic capabilities
- IoT
- Desktop
- In-car systems
- One core with multiple ecosystems
As the saying goes, "Cross-end technology without dynamic capabilities lacks soul." Dynamic frameworks and cross-end frameworks do overlap significantly. I previously summarized five ways to implement dynamic capabilities:
- WebView-based enhancements
- DSL-based native enhancements
- GPL-based native enhancements
- Plugin-based (Android)
- Utilizing OC runtime dynamic features (iOS)
I also drew a diagram to supplement the explanation:
Note: This diagram was drawn relatively early; in fact, the upper left corner can be replaced with "Flutter and other self-drawn frameworks."
At their core, they all load executable code at runtime and call it. Note that the first three implementation ideas match technical directions 1, 2, and 4 summarized above for cross-end frameworks.
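"Load executable code at runtime and call it" can be demonstrated with a small Python analogy. This is only a sketch of the principle, using Python's own `compile` and `exec` in place of a JS or Wasm engine; `load_bundle`, `make_label`, and the bundle text are hypothetical.

```python
from typing import Callable, Dict, List


def load_bundle(source: str, host_api: Dict[str, Callable]) -> Dict[str, object]:
    """Compile a downloaded script at runtime and return its namespace.

    `host_api` is what the host app chooses to expose to the script,
    similar in spirit to the bindings a framework injects into its runtime.
    """
    namespace: Dict[str, object] = dict(host_api)
    exec(compile(source, "<bundle>", "exec"), namespace)
    return namespace


# Pretend this text arrived over the network after the app was released.
bundle_source = '''
def render_page(user):
    return make_label("Hello, " + user)
'''

calls: List[str] = []


def make_label(text: str) -> str:
    calls.append(text)      # the "native" side records what it was asked to draw
    return "<label>" + text + "</label>"


bundle = load_bundle(bundle_source, {"make_label": make_label})
result = bundle["render_page"]("world")
```

The page logic (`render_page`) was not part of the shipped program, yet it runs and drives host capabilities. Each of the five ideas listed above is a different way of obtaining this property on a mobile platform.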
I believe the technical points include the following:
- Script engine
- Rendering engine
- Debugger
- Engineering solutions
Let’s introduce them one by one.
Script Engine#
There are three options for the script engine:
- JS engine: used only as a glue language, not heavily reliant on JIT.
- Dart VM: mainly to leverage the Flutter Engine for rendering, thus using the Dart ecosystem.
- WASM: requires designing a DSL and implementing the rendering engine to complete the entire ecosystem.
If choosing a JS engine, there are also several selection ideas:
- Use dual engines: each platform uses its strongest engine. Android uses V8 via J2V8, while iOS uses the system JSCore, which adds nothing to the package size. Unfortunately, JIT is not available when JSCore is used directly.
- Use Hermes as a single engine: a script engine developed by Meta specifically for hybrid frameworks like React Native, ready to use out of the box.
- Use QuickJS as a single engine: a JS engine written by Fabrice Bellard, known for its small size and excellent performance.
- Use a self-developed JS engine: most of these in the industry are optimized versions of QuickJS.
That said, QuickJS currently has some issues:
- No JIT; this can be added if needed. JIT significantly improves execution efficiency, but for a glue language that is not the primary concern. JIT can also increase cold start time, memory usage, and package size, and it cannot be used on iOS.
- Manual GC is difficult to manage and maintain; this can be optimized.
- Missing line number records.
- No Debugger; some open-source projects on GitHub have added one.
- Lack of code cache.
- Lack of inline cache.
- Lack of memory leak detection capabilities.
- Bytecode has significant optimization potential.
Rendering Engine#
The selection ideas for the rendering engine are twofold:
- Based on Flutter Engine.
- Based on system graphics libraries, such as Skia / OpenGL / Metal / Vulkan.
Regardless of the basis, the overall framework idea is to streamline the pipeline and use synchronous rasterization.
Debugger#
A Debugger is a tool that lets developers pause the JavaScript runtime and inspect its internal state in real time; it is essential for any framework.
Currently, there are three main debugging protocols, and the frameworks mentioned earlier have implemented at least one of them:
- CDP: Chrome DevTools Protocol.
- DAP: Debug Adapter Protocol.
- Self-built protocol: WeChat mini-programs initially used a self-built protocol.
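For a feel of what implementing CDP involves, the sketch below builds a few messages from the protocol's Debugger domain in Python. `Debugger.enable`, `Debugger.setBreakpointByUrl`, and `Debugger.paused` are real CDP names; the helper functions, the bundle URL, and the trimmed event payload are assumptions for illustration, and the WebSocket transport is left out.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_ids = count(1)


def cdp_request(method: str, params: Optional[Dict[str, Any]] = None) -> str:
    """Serialize one CDP command: an id, a Domain.method name, and params."""
    message: Dict[str, Any] = {"id": next(_ids), "method": method}
    if params:
        message["params"] = params
    return json.dumps(message)


def is_paused_event(raw: str) -> bool:
    """Events carry no id; Debugger.paused signals that execution stopped."""
    message = json.loads(raw)
    return "id" not in message and message.get("method") == "Debugger.paused"


# Typical opening sequence sent by a debugger front end.
enable = cdp_request("Debugger.enable")
set_bp = cdp_request(
    "Debugger.setBreakpointByUrl",
    {"url": "https://example.com/bundle.js", "lineNumber": 41},  # 0-based line
)

# What the runtime might send back when a breakpoint is hit (heavily trimmed).
event = json.dumps({"method": "Debugger.paused",
                    "params": {"reason": "other", "callFrames": []}})
```

On the framework side, supporting CDP means the script engine must be able to answer these commands: map URL and line to a bytecode position, stop there, and report call frames. This is why missing line number records in the engine block Debugger support.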
Engineering Solutions#
Engineering solutions at least include the following work:
- Resource loading solutions.
- Degradation handling.
- Version management.
- Development modes.
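Version management and degradation handling from the list above usually meet at page-open time. The following Python sketch shows one plausible shape of that decision. It is not Kant's or any other framework's design; `Bundle`, `open_page`, the `native:`/`webview:` result strings, and the URLs are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn "1.4.0" into (1, 4, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


@dataclass
class Bundle:
    url: str
    min_runtime: str      # oldest framework runtime this bundle can run on
    fallback_url: str     # plain web page used when the bundle cannot run


def open_page(bundle: Bundle, runtime_version: str,
              load: Callable[[str], str]) -> str:
    """Load the bundle if the runtime is new enough, else degrade to the web page."""
    if parse_version(runtime_version) < parse_version(bundle.min_runtime):
        return "webview:" + bundle.fallback_url
    try:
        return "native:" + load(bundle.url)
    except Exception:
        # Download or execution failed: degrade instead of showing a blank page.
        return "webview:" + bundle.fallback_url


def failing_load(url: str) -> str:
    """Simulates a resource loader whose network request fails."""
    raise IOError("network down")


page = Bundle("https://example.com/page.bundle", "1.4.0",
              "https://example.com/page.html")
```

Comparing versions as integer tuples avoids the classic string-comparison bug where "1.10.0" sorts before "1.4.0".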
QQ Music's Kant has detailed designs and implementations for these engineering concerns, which I will not elaborate on here.
Conclusion#
Common issues faced by self-drawn frameworks and their solutions:
- Poor development experience: Use the front-end ecosystem, i.e., provide a JS Runtime; need to provide a Debugger; IDE needs to support language services.
- Poor documentation: Write good documentation.
- Insufficient CSS capabilities: Align with standards; if CSS is implemented on the Dart side, RenderObject needs to be made thick.
- Styles differ from H5: Accumulate test cases and validate with WPT.
- Inconsistencies between Android and iOS: Utilize existing resources, possibly based on Flutter.
- Too few components, lacking an ecosystem: Align with W3C standards as completely as possible.
- Poor JS execution performance: Develop an in-house JS engine.
- Not standardized, unable to reuse community libraries: Align with W3C standards as completely as possible.
Some valuable lessons learned:
- Standards are paramount.
- Provide comprehensive documentation.
- Minimize in-house development and make reasonable use of existing resources.
- Development experience is crucial.
- Pay attention to performance on low-end devices.
References and Further Reading#
- PPT (redacted): https://weekly.ursb.me/slide/cross-end/
- Men Liu: "The Architecture and Practice of Taobao's Next-Generation Self-Drawn Rendering Engine" (2023 GMTC)
- Nie Xinxin: "Waft: A Development Framework for AIoT Applications Based on WebAssembly and Skia" (2023 GMTC)
- Shenghuai: "How WebF Achieves High Performance in Flutter + Web Integration" (2022 QCon)
- Ji Feng: "New Species in the Large Terminal Field - KUN"
- openwebf/WebF
- Airing: "Kant's Practice in 'QQ Music'" (not publicly available)
- Airing: "Flutter Dynamic Solutions" (not publicly available)