aea0cd04 glebius Aug. 17, 2022, 6:50 p.m.
Should have been done in 89128ff3e42.
cgit
a6b982e2 glebius Aug. 17, 2022, 6:50 p.m.
81a34d37 glebius Aug. 17, 2022, 6:50 p.m.
The method was called for two different conditions: 1) the VM layer is
low on pages or 2) one of UMA zones of mbuf allocator exhausted.
This change 2) into a new event handler, but all affected network
subsystems modified to subscribe to both, so this change shall not
bring functional changes under different low memory situations.

There were three subsystems still using pr_drain: TCP, SCTP and frag6.
The latter had its protosw entry for the only reason to register its
pr_drain method.

Reviewed by:		tuexen, melifaro
Differential revision:	https://reviews.freebsd.org/D36164
cgit
1922eb3e glebius Aug. 17, 2022, 6:50 p.m.
They were useful many years ago, when the callwheel was not efficient,
and the kernel tried to have as little callout entries scheduled as
possible.

Reviewed by:		tuexen, melifaro
Differential revision:	https://reviews.freebsd.org/D36163
cgit
a0d7d247 glebius Aug. 17, 2022, 6:50 p.m.
Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36162
cgit
b730de8b glebius Aug. 17, 2022, 6:50 p.m.
While here remove recursive network epoch entry in mld_fasttimo_vnet(),
as this function is already in epoch.

Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36161
cgit
0ce4d7ec glebius Aug. 17, 2022, 6:50 p.m.
Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36160
cgit
6c452841 glebius Aug. 17, 2022, 6:50 p.m.
Modern TCP stacks uses multiple callouts per tcpcb, and a global
callout is ancient artifact.  However it is still used to garbage
collect compressed timewait entries.

Reviewed by:		melifaro, tuexen
Differential revision:	https://reviews.freebsd.org/D36159
cgit
160f01f0 glebius Aug. 17, 2022, 6:50 p.m.
Reviewed by:		melifaro
Differential revision:	https://reviews.freebsd.org/D36236
cgit
78b1fc05 glebius Aug. 17, 2022, 6:50 p.m.
The protosw KPI historically has implemented two quite orthogonal
things: protocols that implement a certain kind of socket, and
protocols that are IPv4/IPv6 protocol.  These two things do not
make one-to-one correspondence. The pr_input and pr_ctlinput methods
were utilized only in IP protocols.  This strange duality required
IP protocols that doesn't have a socket to declare protosw, e.g.
carp(4).  On the other hand developers of socket protocols thought
that they need to define pr_input/pr_ctlinput always, which lead to
strange dead code, e.g. div_input() or sdp_ctlinput().

With this change pr_input and pr_ctlinput as part of protosw disappear
and IPv4/IPv6 get their private single level protocol switch table
ip_protox[] and ip6_protox[] respectively, pointing at array of
ipproto_input_t functions.  The pr_ctlinput that was used for
control input coming from the network (ICMP, ICMPv6) is now represented
by ip_ctlprotox[] and ip6_ctlprotox[].

ipproto_register() becomes the only official way to register in the
table.  Those protocols that were always static and unlikely anybody
is interested in making them loadable, are now registered by ip_init(),
ip6_init().  An IP protocol that considers itself unloadable shall
register itself within its own private SYSINIT().

Reviewed by:		tuexen, melifaro
Differential revision:	https://reviews.freebsd.org/D36157
cgit
ef8b8723 imp Aug. 17, 2022, 5:39 p.m.
Move the mbr non-geli zfs cases to no-priv creation with makefs / mkimg.
Add comments about the weird thing we do for MBR + ZFS + Legacy.  Add
comments about other architectures. Still need to think through how to
leverage a completed universe to do all the architectures...

Sponsored by:		Netflix
cgit
f7473c7e schweikh Aug. 17, 2022, 5:23 p.m.
3c405c7e schweikh Aug. 17, 2022, 5:13 p.m.
fa46f370 jhb Aug. 17, 2022, 5:01 p.m.
Certain operations such as checksum insertion and VLAN insertion
require the device model to rewrite the packet header.  The first step
in rewriting the packet header is to copy the existing packet header
from the source packet.  This copy is done by copying data from an
iovec array that corresponds to the S/G entries described by transmit
descriptors.  However, if the total packet length is smaller than the
headers that need to be copied as the initial template, this copy can
overflow the iovec array and use garbage values as the source pointer
to memcpy.  The PR used a single descriptor with a length of 0 in its
PoC.

To fix, track the total packet length and drop requests to transmit
packets whose payload is smaller than the required header length.

While here, fix another issue where the final descriptor could have an
invalid length (too short) that could underflow 'len' when stripping
the checksum.  Skip those requests instead, too.

PR:		264372
Reported by:	Robert Morris <rtm@lcs.mit.edu>
Reviewed by:	grehan, markj
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D36182
cgit
e7439f6a jhb Aug. 17, 2022, 5 p.m.
This avoids type confusion where a malicious guest could rewrite the
MaxPStreams field in an endpoint context after the endpoint was
initialized causing the device model to interpret a guest provided
address (stored in ep_ringaddr of the "software" endpoint state) as a
bhyve host process address (ep_sctx_trbs).  It also prevents a malicious
guest from triggering overflows of ep_sctx_trbs[] by increasing the
number of streams after the endpoint has been initialized.

Rather than re-reading the MaxPStreams value out of the endpoint context
in guest memory on subsequent operations, cache the value in the software
endpoint state.  Possibly the device model should raise errors if the
value of MaxPStreams changes while an endpoint is running.  This approach
simply ignores any such changes by the guest.

PR:		264294, 264347
Reported by:	Robert Morris <rtm@lcs.mit.edu>
Reviewed by:	markj
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D36181
cgit