Except it isn't. Code isn't one single pattern repeating again and again; on large enough bodies of code, RISC-V is the most dense, and it's not even close.
Decades of demoscene productions beg to differ. That just means compilers are awful, as they usually are.[1] x86 has far more optimisation opportunities than any RISC.
RVA23 is, finally, the belated admission that maybe we shouldn't have everything as optional extras. Hopefully it'll take off; I can't imagine what sort of headache it is for repo maintainers who have to track a dozen different variants of binaries depending on which flavour of RISC-V the apt-get is coming from.
RVA23 (and RVA20 before it) isn't an admission that RISC-V got it wrong. It's a necessary step to make RISC-V competitive in the desktop space, as opposed to microcontrollers, where the flexibility is hugely valuable.
Well, for different reasons, but you have similar issues with IPv6 as well. If your client uses temporary addresses (most likely, since they're enabled by default on most OSes), OpenSSH will pick one of them over the stable address, and when they're rotated the connection breaks.
For some reason, OpenSSH devs refuse to fix this issue, so I have to patch it myself:
--- a/sshconnect.c
+++ b/sshconnect.c
@@ -26,6 +26,7 @@
 #include <net/if.h>
 #include <netinet/in.h>
 #include <arpa/inet.h>
+#include <linux/ipv6.h>
 #include <ctype.h>
 #include <errno.h>
@@ -370,6 +371,11 @@ ssh_create_socket(struct addrinfo *ai)
 	if (options.ip_qos_interactive != INT_MAX)
 		set_sock_tos(sock, options.ip_qos_interactive);
+	if (ai->ai_family == AF_INET6 && options.bind_address == NULL) {
+		int val = IPV6_PREFER_SRC_PUBLIC;
+		setsockopt(sock, IPPROTO_IPV6, IPV6_ADDR_PREFERENCES, &val, sizeof(val));
+	}
+
 	/* Bind the socket to an alternative local IP address */
 	if (options.bind_address == NULL && options.bind_interface == NULL)
 		return sock;
I'm not sure what happens to the socket, maybe it's closed and reopened, but with this patch I have SSH sessions lasting for days with no issues. Without it, even roaming between two access points can break the session.
It would also seem to break address privacy (usually not much of a concern if you authenticate yourself via SSH anyway, but still, it leaks your Ethernet or Wi-Fi interface's MAC address in many older setups).
Not anonymous, but it's pretty unexpected for different servers with potentially different identities for each to learn your MAC address (if you're using the default EUI-64 method for SLAAC).
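For anyone wondering how the MAC ends up in the address: below is a minimal sketch of the modified EUI-64 derivation as I understand it from RFC 4291, Appendix A. The function names `mac_to_eui64` and `format_iid` are made up for illustration and don't come from any real stack.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Derive a 64-bit interface identifier from a 48-bit MAC address:
 * split the MAC in half, insert ff:fe in the middle, and invert the
 * universal/local bit of the first byte. Hypothetical helper name. */
void mac_to_eui64(const uint8_t mac[6], uint8_t iid[8])
{
    assert(mac != NULL && iid != NULL);
    iid[0] = mac[0] ^ 0x02;  /* flip the universal/local bit */
    iid[1] = mac[1];
    iid[2] = mac[2];
    iid[3] = 0xff;           /* fixed filler bytes */
    iid[4] = 0xfe;
    iid[5] = mac[3];
    iid[6] = mac[4];
    iid[7] = mac[5];
}

/* Format the identifier as the low four groups of an IPv6 address.
 * The output buffer must hold 19 characters plus the terminator. */
void format_iid(const uint8_t iid[8], char out[20])
{
    snprintf(out, 20, "%02x%02x:%02x%02x:%02x%02x:%02x%02x",
             iid[0], iid[1], iid[2], iid[3],
             iid[4], iid[5], iid[6], iid[7]);
}
```

With a MAC of 00:11:22:33:44:55 this gives 0211:22ff:fe33:4455, so every server you connect to from that address can recover the MAC by undoing the two steps.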
This is a very common misconception. The issue is not IPv4 or CGNAT, it's stateful middleboxes... of which IPv6 has plenty.
The largest IPv6 deployments in the world are mobile carriers, which are full of stateful firewalls, DPI, and mid-path translation. The difference is that when connections drop it gets blamed on the wireless rather than the network infrastructure.
Also, fun fact: net.ipv4.tcp_keepalive_* applies to IPv6 too. The "ipv4" is just a naming artifact.
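If you'd rather not touch the sysctls, the same timers can be set per socket. A minimal sketch, assuming Linux option names (`TCP_KEEPIDLE`, `TCP_KEEPINTVL`, `TCP_KEEPCNT`); `enable_keepalive` is a hypothetical helper, and the same calls apply to an AF_INET or AF_INET6 TCP socket:

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Turn on TCP keepalive for one socket and override the system-wide
 * defaults (net.ipv4.tcp_keepalive_time/_intvl/_probes) for it.
 * Returns 0 on success, -1 if any setsockopt call fails. */
int enable_keepalive(int sock, int idle_s, int interval_s, int probes)
{
    int on = 1;

    assert(idle_s > 0 && interval_s > 0 && probes > 0);

    if (setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    /* Seconds of idle time before the first probe is sent. */
    if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
        return -1;
    /* Seconds between successive probes. */
    if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof(interval_s)) < 0)
        return -1;
    /* Unanswered probes before the connection is declared dead. */
    if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof(probes)) < 0)
        return -1;
    return 0;
}
```

Something like `enable_keepalive(sock, 60, 10, 5)` keeps idle sessions generating traffic often enough for most middlebox timeouts, though the right numbers depend entirely on the network in between.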
Mobile carriers usually have stateful firewalls for IPv6 as well (otherwise you can get a lot of random noise on the air interface, draining both your battery and data plan), so it's an issue just the same.
The constrained resource there is only firewall-side memory, though, as opposed to that plus (IP, port) tuples for CG-NAT.
Or my predecessor/address space neighbor, or that of somebody using my wireless hotspot once, or that of me clicking a random link once and connecting to 671 affiliated advertisers' analytics servers...
I think a default policy of "no inbound connections" does make sense for most mobile users. It should obviously be configurable.
Fast, RVA23-compatible microarchitectures already exist. Everything high performance seems to be based on RVA23, which is the current application profile and comparable to ARMv9 and x86-64-v4.
However, it takes time from microarchitecture to chips, and from chips to products on shelves.
The very first RVA23-compatible chip to show up will likely be the SpacemiT K3 SoC, due in development boards in April (i.e. next month).
More of them, and more performant ones, are coming this summer, such as a development board with the Tenstorrent Ascalon CPU in the form of the Atlantis SoC, which was taped out recently.
It is even possible such designs will show up in products aimed at the general public within the present year.
To the best of my knowledge (and Google-fu), 26K really isn't a lot of transistors for an embedded MCU - at least not a fully-featured 32-bit one comparable to a minimal RISC-V core. An ARM Cortex M0, which is pretty much the smallest thing out there, is around 10K gates => around 40K transistors. This is also around the same size as a minimal RISC-V core AFAICT.
There's a reason RV32E and RV64E, with half the registers, are a thing. RV32I/RV64I isn't small enough.
There are many chips in the market that do embed 8051s for janitorial tasks, because it is small and not legally encumbered. Some chips have several non-exposed tiny embedded CPUs within.
RISC-V is replacing many of these, bringing modern tooling. There are even open source designs like SERV that fit in a corner of an already small FPGA, leaving room for other purposes.
Per https://en.wikipedia.org/wiki/Transistor_count, even an 8051 has 50K transistors, which reinforces my claim that 26K really doesn't seem like a big ask for an MCU core. Whether that means a barrel shifter is worth it or not is a totally orthogonal question, of course.
(Although I do have to eat my words here - I hadn't checked that Wikipedia page, and it does actually list a ~6K RISC-V core! It's an experimental academic prototype "made from a two-dimensional material [...] crafted from molybdenum disulfide". I don't know if that construction might allow for a more efficient transistor count, and it's totally impractical - 1 kHz clock speed, 1-bit ALU, etc. - for almost any purpose, but it is technically a RISC-V implementation significantly smaller than 26K.)
> I don't know if that construction might allow for a more efficient transistor count and it's totally impractical - 1KHz clock speed, 1-bit ALU, etc. - for almost any purpose, but it is technically a RISC-V implementation significantly smaller than 26K
That sounds like a microcoded RISC-V implementation, which can really be done for any ISA at the extreme expense of speed.
If I'm not mistaken, microcode is a thing at least on Intel CPUs, and that is how they patched Spectre, Meltdown and other vulnerabilities – Intel released a microcode update that the BIOS applies at cold start to hot-patch the CPU.
Maybe other CPUs have it as well, though I do not have enough information on that.
> There's reason RV32E and RV64E, with half the registers, are a thing. RV32I/RV64I isn't small enough.
This is actually kind of counter to your point. The really tiny microcontrollers from the 80s only had 224 bits of registers. RV32E is at least twice that (16 registers * 32 bits = 512 bits), and modern MCUs generally have 2-4 KB of SRAM, so the overhead of a 32-bit barrel shifter is pretty minimal.
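To put a rough number on the shifter: one common construction is logarithmic, with log2(width) stages of 2:1 muxes, each stage either passing the value through or shifting it by a power of two. The sketch below models a left-shift-only version in C; real designs differ (right and arithmetic shifts add logic), and `barrel_shl` and `barrel_mux_count` are illustrative names of mine.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 32-bit logarithmic barrel shifter (left shift only).
 * Stage k shifts by 2^k when bit k of the shift amount is set, so
 * five stages cover every shift amount from 0 to 31. */
uint32_t barrel_shl(uint32_t x, unsigned shamt)
{
    assert(shamt < 32);
    for (unsigned stage = 0; stage < 5; stage++) {
        if (shamt & (1u << stage))
            x <<= (1u << stage);
    }
    return x;
}

/* Count the 2:1 muxes in this construction: one per bit per stage.
 * Assumes width_bits is a power of two. */
unsigned barrel_mux_count(unsigned width_bits)
{
    unsigned stages = 0;
    while ((1u << stages) < width_bits)
        stages++;
    return stages * width_bits;
}
```

Under those assumptions a 32-bit shifter comes to 5 * 32 = 160 two-input muxes, which is the figure to weigh against the 512 bits of register file.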
>but it's a lot more instructions so it won't be used in practice.
It will be used where the case actually needs to be handled, i.e. where elsewhere an exception would be relied on to handle it. That is seldom the case.
More instructions doesn't mean slower, either. Superscalar machines have a hard time keeping themselves busy, and this is an easily parallelizable task.
>The designers of RISC-V included the bare minimum needed to compile C, everything else was deemed irrelevant.
Refer to "Computer Architecture: A Quantitative Approach" by John L. Hennessy and David A. Patterson for the actual methodology followed.