Ember doesn’t seem to load in Safari - I had the same problem.
I guess my question is: how big of a deal is this to actual Android users?
You’re comparing raw iOS numbers to Android numbers, and sure, they can be a little jarring.
But Android users aren’t sitting there holding an iPhone in one hand and their own phone in another. In other words, instead of comparing Android numbers to iOS numbers, maybe try comparing “Discourse on Android” numbers to “Other sites on Android” numbers. And not just numbers, but usability, etc.
I have a Galaxy Nexus, and only-okay internet usage comes with the territory of having an old device. Most websites are pretty slow. But I don’t also have an iPhone, so in a way I don’t know any better.
Discourse might be 5 times slower on Android compared to iOS, but it’s about the same speed (?) as other websites on the same device, and I think that’s the comparison most people actually care about. Apples and oranges, and all that.
I won’t say you’re making a mountain out of a molehill, but I do think maybe the people who live on the mountain don’t really mind that it takes them a little longer to get to the nearest Chipotle. Instead of worrying that people in the city can get to a Chipotle in 10 minutes instead of the hour it takes the mountain people, compare the time it takes the mountain people to get to the nearest McDonalds, because that’s the only comparison that matters to the mountain people.
Android users as mountain folk and Discourse as a Chipotle. I think that metaphor got away from me.
Edit: Whoops, I guess this is pretty much what @benfreke said. So, uh, what he said.
Just scored 236ms on my Note 5 using Next Browser. Looks like hardware isn’t the most important factor here.
Qualcomm Snapdragon series, Samsung Exynos series, I suppose Nvidia Tegra and standard ARM for the companies without an architecture license. That’s not that many, unless I misunderstood or I’m missing something. I just don’t think Google is trying all that hard to get Chrome high and tight. They seem to be depending on yearly faster hardware covering up the issues until they are fast enough for people to not complain.
With configurations I was referring to the fact that Apple has the A7, A7X, A8, and A8X, I think, and a few combinations with an extra core. But their designs are VERY homogeneous.
The situation with Android (CPUs):
They vary from 1 to 8 cores. Some designs are 2 big 2 small, some are 4 big, some are 4 small (ish).
That’s spanning several architectures and bit sizes. They presumably require different optimizations - and Google will have a hell of a hard time optimizing for all of them.
That’s not even getting into screen sizes, ram amount, storage speeds etc…
There are several orders of magnitude more Android configurations than there are iOS configurations.
I wonder what the benchmark looks like running in Intel’s Crosswalk. When I was attempting to deploy IonicFramework apps, it greatly improved the performance and consistency of our app. It made scrolling almost bearable.
That’s a bit of backwards reasoning. With numerous slow cores versus a few high-performing cores, you can’t expect the same performance. Numerous slow cores will always be slower here. Just the work of switching from core to core, the way these processors are designed, wastes too many cycles.
Then there is the problem of four faster, more sophisticated cores, with four slower, less sophisticated cores. The processors were not designed for all of them to be working on the same problem at once. It’s a fundamental error in usage.
It is a combination of software and hardware. Based on Geekbench scores, a Galaxy S6 should be able to do close to as well as an iPhone 5s, which has a very similar Geekbench single-core score. You can see this in the AnandTech results I screenshotted when comparing the native browser and Chrome.
@tgxworld the benchmark runs fine in my very stock iPhone 6s under Mobile Safari. Maybe hard refresh?
I’m not all that sure it matters. How long before phones get 10x faster? How long before they’re 100x faster? Just limit your expectations, find new ways to make things seem more responsive. You are talking about devices a hundred, if not a thousand times more performant than the ones we’ve been happily playing Doom and Quake on. They might not be as quick as you’d like, but they are not slow.
Slow is a relative term, so yeah, compared to the PCs that were around when Doom and Quake came out, no, phones are not slow - but that comparison is a fallacy. The experience on an Android phone is fundamentally slow compared to other platforms, noticeably in fact.
Surely designing and validating the CPU cores is meaningfully more expensive (and time consuming) than working with existing ARM IP?
Fewer, faster cores are better for most single-user apps and OSs, not just JS engines, and when they’re not, GPUs are often a better option. So the question then is why Android is in this sorry state, and what might the future hold?
How Did We Get Into This Mess?
I think that this is a result of an interplay between ARM’s business strategy, the strategy of SoC makers, and the marketing needs of Android phone vendors.
Android phone vendors are already at a disadvantage when competing with Apple because most lack the scale and integration of Apple, and therefore can’t control as much of the user experience. This situation lends itself to competing on specs.
Core count and clock speed are both numbers that can be flogged by phone makers against Apple, and against one another. Unfortunately, pushing clock speed too far comes at the expense of an even more important spec: battery life.
The SoC makers need to deliver “improvements” at a suitable pace for the phone makers. Because there are multiple players in the Android market, a suitable pace is multiple times a year. Unfortunately for the SoC makers, there are forces outside their control that limit the pace of meaningful improvements. As a result, they have resorted to gimmicks, like large core counts.
The forces outside the control of SoC makers are manifold. Perhaps the most fundamental is the pace of improvements in semiconductor fabrication technology. Moore’s “law” has had a good long run, and over that run, the economic and technological forces involved have ended up synchronizing around delivering new process nodes approximately every two years. In between, the main opportunity for performance improvements comes from architectural changes.
This brings us to the other force outside the control of most ARM SoC makers: they rely on ARM for CPU core designs. Qualcomm and NVIDIA are major exceptions. Both have their own CPU cores, but even so, they still rely on ARM for the CPU cores in many of their SoCs. For Qualcomm, this is apparently the result of having misjudged how aggressively Apple would pursue its own core designs. This strategic misstep left them without any high-end 64-bit offering for a time, and resulted in them turning to an ARM core design while they finish their own in-house 64-bit ARM core. For NVIDIA, it seems to be more the slow pace of their own program to develop in-house ARM cores.
Developing their own ARM cores is an expensive and long-running commitment. They have to acquire the appropriate license from ARM, hire suitable staff, and commit to the long process of developing and validating an in-house CPU core design. The challenge of hiring should not be underestimated. It is pretty clear that the number of people who can be trusted in key roles in a program to develop a cutting-edge CPU core is quite limited; just witness the way certain key people have shifted around between Apple, AMD, etc.
I don’t think that people can expect ARM itself to develop a competitive high-performance core, because that undermines their relationship with architectural licensees, like Apple, Qualcomm and NVidia. The risk of alienating Apple is particularly high, I think, because Apple, which has core design, SoC design, device design, OS, developer tools, and application distribution under its control, can move to another instruction set architecture if need be. As a result, the core designs that can be licensed from ARM are likely to fall short of being leading edge.
What Might the Future Hold?
There will be new Android devices released, with new SoCs. There is no guarantee that they will have a marked improvement in single core performance, but it seems likely that, at some point soon, more cores will no longer be viable or acceptable, and that raw single-core performance will gradually improve. It will be years though before there are meaningful improvements in the raw single-core performance of the average Android device in the field.
In the longer term, there is also some chance that raw single-core performance in new devices makes a significant leap at some point, when/if Qualcomm gets back in the swing of doing its own cores.
In the longer term, there is another possibility for Android devices with significantly improved single-core performance: non-ARM SoCs. I haven’t been keeping track of Intel’s efforts, but I know their mobile chips have good single-threaded performance, and I think they probably need to get a good foothold in the mobile market, or die trying. And then there is Imagination, which licenses the GPU cores used by Apple and other high-end SoC makers, and which recently bought MIPS.
One thing worth considering, that I don’t have a great sense of, is the impact of demographic trends. If we assume that globally, over, say, the next 4 years, Android will hold or gain market share against iOS because of cost considerations, what does that mean for the performance of mainstream Android devices? Will most of those low-cost devices use old, lame SoCs and therefore lag the progress I predict/hope must come to Android single-core performance, or will they at least see the same pace of performance improvements, even if they start from a relative disadvantage?
tl;dr: What Does This Mean for Discourse?
As for Discourse, it seems to me that your assumptions should be that:
- JS performance on mainstream Android devices will lag that of mainstream iOS devices by a significant degree for at least the next 3 years.
- JS performance in the Android mainstream will gradually improve in absolute terms.
- JS performance in the Android mainstream may improve relative to iOS in the 1-3 year timeframe.
Recommendations based on the above:
- Inform end-users of the problems in order to create incentives for browser and hardware makers to address the problem
- Based on usage stats, figure out the minimum current hardware/software you want to target.
- Identify and prioritize application-level performance problems that have the greatest impact on user experience for users in your target
- Fix those worth fixing.
- When in doubt, target performance improvements that yield gains on both Android and iOS, followed by those that can be tuned based on platform/performance, etc. (for example, the way you currently halve the amount of data sent to Android clients).
- When possible, address issues by encouraging/supporting efforts to address them in libraries/frameworks you use (e.g. Ember).
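To make the platform-tuning recommendation concrete, here is a minimal sketch in the spirit of the "halve the data sent to Android clients" approach mentioned above. The function name, chunk sizes, and user-agent sniff are hypothetical illustrations, not Discourse's actual implementation.

```javascript
// Hypothetical example: send smaller pages to clients whose JS engines
// are known to be slow, so the expensive initial render does less work.
function topicChunkSize(userAgent) {
  // Crude UA sniff: treat any Android browser as a slow client.
  const isAndroid = /Android/i.test(userAgent);
  const defaultChunk = 20; // posts per page for fast clients
  // Halve the payload for slow clients; they page more often, but each
  // render stays within their per-frame JS budget.
  return isAndroid ? defaultChunk / 2 : defaultChunk;
}
```

The same conditional could gate animations, infinite-scroll thresholds, or image preloading; the point is one detection step feeding several cheap tuning knobs.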
I don’t think the number of different devices is that big a deal as far as browsers go. x86 isn’t new, and ARM is ARM outside of the manufacturers doing their own designs (still based on ARM). If you sell cases, yes, the number of different Android devices would annoy you if you expect to cover every device. RAM amount shouldn’t make a difference, and storage speed shouldn’t make a difference, outside of some contrived and unlikely edge scenarios, like a third-rate AOSP manufacturer breaking libraries and using janky memory, or something with a 2-in low-res 16x8 display. Screen sizes have not been an issue unless you are a particularly stupid developer who maps layouts to individual pixels in a universal Android app rather than following all the guidance, stubbornly using Apple techniques from the pre-iPad, pre-iPhone-5 days and expecting them to carry over for some reason.
For all the benchmarks that have Apple abso-slaying everything ever, in practice their devices aren’t noticeably faster compared to same-gen Android. In many cases, their animations make things take longer than they need to, and they use image placeholders to cover up slowness. I feel there is a lot more to the story.
This is definitely what the Google folks reported on the bug that was filed on V8:
I updated v8:3664 with a prototype fix, which reduces number of deoptimizations caused by weak references. To see noticeable performance improvement we need to also fix v8:3663, otherwise there are many deoptimizations caused by high degree of polymorphism. Not clearing ICs would allow generated code to reach the slow generic case faster and thus avoid deoptimizations. This however is a big project (several months) as it requires support for weak references in polymorphic ICs to avoid memory leaks.
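For anyone unfamiliar with the jargon in that comment, here is a tiny illustration (not Discourse or V8 code) of what a "high degree of polymorphism" means at a single property-access site. When one access site sees objects of many different hidden classes, V8's inline cache for that site degrades from a fast monomorphic lookup toward the slow generic path, and optimized code built on the earlier assumptions can get deoptimized.

```javascript
// This access site is the inline cache (IC) in question.
function titleLength(post) {
  return post.title.length;
}

// Monomorphic: every object has the same shape (hidden class) -> the IC
// can cache a single fast lookup.
for (let i = 0; i < 1000; i++) titleLength({ title: "hi" });

// Polymorphic: objects created with different property sets/orders get
// different hidden classes, so the same site must now check several
// shapes on every call - and past a threshold it goes megamorphic.
titleLength({ title: "a", id: 1 });
titleLength({ id: 2, title: "b" });
titleLength({ title: "c", id: 3, read: true });
```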
They made some incremental improvements in Chrome 44 and beyond, which definitely helped, as you can see from my Chrome benchmarks on a Nexus 9:
Chrome 44 (beta) Render Complex List | 2.78 | 1421 | 34 | 359.85
Chrome 45 (dev) Render Complex List | 3.32 | 1545.03 | 40 | 301.65
About a 20% improvement, which is not chopped liver, but when you are starting in a 500% hole… I don’t mean to fault the Google folks, they have done some good work in this area, but we need further improvements.
Was just reading this Anandtech review of the iPhone 6s… the numbers are crazy.
How I wish Android worked the same way…
I cannot repro this. On my Nexus 7 (2013) I get 974ms in Chrome 46 beta, and 1067ms in Next Browser. Are you sure you are running the benchmark exactly as specified in the first post of this topic?
Your numbers are quite suspect, since they show the Note 5 as faster than the Nexus 9, which does 300ms in this test and is also the fastest Android processor according to Geekbench single-core results.
Would a framework or app architecture that made better use of Web Workers be feasible for Discourse? Is there a suitable Worker-driven design that would address the major performance concerns experienced here?
I realise this introduces new problems (concurrency, marshalling and transferring data between Workers, etc.), but is it conceivable that such an approach could make better use of multiple CPU cores and reduce the dependency on single-core performance?
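For what it's worth, here is a rough sketch of the shape such a Worker-driven design might take, keeping the CPU-heavy step as a pure function so the main thread only does the final DOM update. All names are hypothetical, and the `typeof self` guard simply lets the pure function also run outside a worker context:

```javascript
// worker.js - runs off the main thread.
// Hypothetical CPU-heavy transform: sort the raw topic data and
// decorate each entry with an excerpt for display.
function prepareTopicList(rawTopics) {
  return rawTopics
    .slice() // don't mutate the input
    .sort((a, b) => b.bumpedAt - a.bumpedAt) // newest first
    .map((t) => ({ ...t, excerpt: t.body.slice(0, 100) }));
}

// Worker wiring: receive raw data, post back the prepared list.
if (typeof self !== "undefined" && typeof self.postMessage === "function") {
  self.onmessage = (e) => self.postMessage(prepareTopicList(e.data));
}

// main.js - the UI thread just posts raw data and renders the result:
// const worker = new Worker("worker.js");
// worker.onmessage = (e) => renderList(e.data); // cheap DOM patch only
// worker.postMessage(rawTopicsFromServer);
```

The catch, as noted above, is that postMessage structured-clones the payload (unless you use Transferables), so for small lists the copying can cost more than it saves; it pays off mainly when the preparation step itself is the bottleneck.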