NCNN vs ONNX vs Tensorflow vs PyTorch

geekynerd · Jul 1, 2026

I have a model compiled in all of these frameworks and want to know which one is suitable for connecting to a Flutter app. The model must run on the device on which the app is installed. It is a lightweight model that even worked on a Raspberry Pi, but I need to choose one framework to connect the model to the app.

Filip Larsen · Jul 1, 2026

Which frameworks have you tried so far, and where are you stuck? (Your post sort of reads like a chatbot prompt, but PF tends to give better answers when prompted with direct and specific questions.)

geekynerd · Jul 1, 2026

I compiled the model in all of the frameworks, but to be honest I found an official Flutter package only for PyTorch, so I tried that. On a Snapdragon 7 Gen 3 phone I get 2 FPS, but the model ran better on a Raspberry Pi, where I got around 20 FPS. (By the way, this is a YOLOv11 model trained on images to recognise some features.) So I thought, why not compile my own package to connect Flutter with the model, so that I can get a decent frame rate? I don't have time to make a package for each framework, so I am asking for suggestions on which framework to use.

Filip Larsen · Jul 2, 2026

In general (as you probably know), getting high enough frame rates on low-end devices can be very difficult. Most frameworks that support prediction on CPU typically run significantly slower there than on GPU/NPU. I have no experience trying to tune for such devices (I work with ONNX models on high-end GPU devices), but I understand a common trick is to try to reduce the image size as much as possible (perhaps via some clever preprocessing) until prediction accuracy starts to deteriorate below your set limit. You may also want to ensure that your network does not use operations that are particularly expensive on CPU. Some frameworks are able to do this analysis, and possibly the optimization, for you when targeting specific hardware, so check if that is an option for you.
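As a rough illustration of the "reduce image size via preprocessing" idea, here is a minimal sketch of the usual aspect-preserving ("letterbox") resize arithmetic. The function name, the 1280x720 camera frame and the 416 target are assumptions for illustration only; nothing in the thread says which preprocessing the model actually expects.

```python
def letterbox_geometry(src_w, src_h, dst=416):
    """Return (new_w, new_h, pad_x, pad_y) for fitting a frame into a dst x dst square.

    Hypothetical helper: it only computes sizes, it does not touch pixel data.
    """
    # Scale by the tighter dimension so the whole frame fits without distortion.
    scale = min(dst / src_w, dst / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    # Center the resized frame; the remaining border is filled with padding.
    pad_x = (dst - new_w) // 2
    pad_y = (dst - new_h) // 2
    return new_w, new_h, pad_x, pad_y


# Assumed 1280x720 camera frame going into an assumed 416x416 model input.
print(letterbox_geometry(1280, 720, 416))
```

Keeping the aspect ratio the same as during training is one way to shrink the input without distorting the objects, though whether accuracy holds up still has to be measured.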

geekynerd · Jul 2, 2026

Filip Larsen said:

In general (as you probably know), getting high enough frame rates on low-end devices can be very difficult. Most frameworks that support prediction on CPU typically run significantly slower there than on GPU/NPU. I have no experience trying to tune for such devices (I work with ONNX models on high-end GPU devices), but I understand a common trick is to try to reduce the image size as much as possible (perhaps via some clever preprocessing) until prediction accuracy starts to deteriorate below your set limit. You may also want to ensure that your network does not use operations that are particularly expensive on CPU. Some frameworks are able to do this analysis, and possibly the optimization, for you when targeting specific hardware, so check if that is an option for you.

In ONNX I have three models, 640*640, 416*416 and 256*256, and I preprocess the live feed down to 30 frames per second and feed it in directly. I trained the model on raw images, so I feel that changing the model input in the name of preprocessing might reduce the accuracy. If you have any ONNX packages for connecting with Flutter, or any general packages you think might be useful for me, please share them.
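To put the three input sizes in perspective, here is a small sketch comparing their pixel counts against the 640*640 model. For a mostly convolutional network the compute per frame scales roughly with the number of input pixels, so these ratios are only a first-order estimate; measured latency depends on the runtime and hardware. The function name is hypothetical.

```python
def relative_pixel_load(sizes, baseline=640):
    """Map each square input size to its pixel count relative to the baseline size."""
    return {s: (s * s) / (baseline * baseline) for s in sizes}


for size, ratio in relative_pixel_load([640, 416, 256]).items():
    print(f"{size}x{size}: {ratio:.2f}x the pixels of 640x640")
```

By this estimate the 256*256 model does roughly one sixth of the work of the 640*640 one per frame, which is worth benchmarking on the phone before changing frameworks.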

Filip Larsen · Jul 2, 2026

geekynerd said:

If you have any ONNX packages for connecting with Flutter, or any general packages you think might be useful for me, please share them.

Sorry, I have no experience with Flutter and have so far only used ONNX via its DML EP (and CPU for testing purposes) on high-end PCs.

If you have a fixed 30 FPS input but the fastest network you have with the desired or minimum detection rate can only guarantee X FPS, then perhaps you can simply forward only every trunc(30/X)-th frame of the input to the network. From what I understand, it would be almost theoretically impossible to do sustained 30 FPS large-image inference with even the simplest of networks on a mobile CPU (I mean, there is a reason phone manufacturers want to push new NPU hardware on mobile platforms), so with that in mind you will most surely have to reduce the load somehow, and down-sampling the input stream in time and/or size is (as far as I know) really the only robust solution (even if it's not optimal).
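The frame-dropping idea can be sketched as below. Note one deliberate deviation, labeled as an assumption: the sketch rounds the stride up with ceil instead of trunc, because with X = 20 trunc(30/X) is 1, which would still forward every frame. The function names are hypothetical and the "frames" are stand-in integers, not real camera buffers.

```python
import math


def frame_stride(input_fps, model_fps):
    """Forward one frame out of every `stride` so the forwarded rate never exceeds model_fps."""
    if model_fps <= 0:
        raise ValueError("model_fps must be positive")
    # ceil (an assumption, the post suggests trunc) keeps the load at or below model_fps.
    return max(1, math.ceil(input_fps / model_fps))


def select_frames(frames, stride):
    """Yield every stride-th frame, starting with the first one."""
    for index, frame in enumerate(frames):
        if index % stride == 0:
            yield frame


# One second of a 30 FPS feed with a model that sustains only 2 FPS.
stride = frame_stride(30, 2)
print(stride, list(select_frames(range(30), stride)))
```

In a real app the skipped frames can still be shown in the preview, with the most recent detections drawn on top until the next result arrives.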

I am not familiar with the YOLOv11 network, but I understand from its description that it is already very much designed for near-real-time detection, so I doubt you can gain detection speed by applying structural changes yourself. With your given fixed input rate, CPU hardware and YOLO network, I don't see any other option than reducing the network load as mentioned above.

geekynerd · Jul 2, 2026

Filip Larsen said:

Sorry, I have no experience with Flutter and have so far only used ONNX via its DML EP (and CPU for testing purposes).

If you have a fixed 30 FPS input but the fastest network you have with the desired or minimum detection rate can only guarantee X FPS, then perhaps you can simply forward only every trunc(30/X)-th frame of the input to the network. From what I understand, it would be almost theoretically impossible to do sustained 30 FPS large-image inference with even the simplest of networks on a mobile CPU (I mean, there is a reason phone manufacturers want to push new NPU hardware on mobile platforms), so with that in mind you will most surely have to reduce the load somehow, and down-sampling the input stream in time and/or size is (as far as I know) really the only robust solution (even if it's not optimal).

I am not familiar with the YOLOv11 network, but I understand from its description that it is already very much designed for near-real-time detection, so I doubt you can gain detection speed by applying structural changes yourself. With your given fixed input rate, CPU hardware and YOLO network, I don't see any other option than reducing the network load as mentioned above.

If I get around 20 of the 30 frames on a mid-range phone I will be more than satisfied, but 2 frames is not acceptable.

I have never worked with AI. Sometimes I work with transformer pipelines for very small projects. I am more of a cloud engineer, so deployment and maintenance are my field, but the place where I am interning asked me to do this.

So any material on connecting a model to Flutter would be good.

I have no codebase from the model development. They just gave me the final compiled files. So any documentation related to this will help me.
