Sockets Capability, Direction and BSD


Last updated at 3:45 pm UTC on 14 January 2006

From: Peter William Lount Sent: January 19, 2004 States his requirements

Is it possible to configure Squeak to listen to specific IP Addresses and Ports, i.e. 1.2.3.4:8000, on a machine that supports multiple IP addresses? If so how? If not, how can we add this feature? A further example is running a Comanche squeak server on 1.2.3.4:80 and on 5.6.7.8:80.

Two basic configurations interest me.
1. Restrict a Squeak image to a specific IP address (or more than one specific address) from the set of addresses that are being homed on the server.
2. Enable Squeak to listen to specific IP:port pairs from the set of addresses that are being homed on the server.

Peter William Lount Summarizes current capability
[Having heard from multiple folks and having done some research]
(1) Listening to Specific IP Addresses
Socket (and OldSocket) have a method that activates a primitive to listen on a specific IP Address and Port pair:

OldSocket >>primSocket: aHandle listenOn: portNumber backlogSize: backlog interface: ifAddr
Socket>>primSocket: aHandle listenOn: portNumber backlogSize: backlog interface: ifAddr

This appears to enable listening for "incoming" connections on a specific address, which makes it possible to run multiple instances of a service on the same port but with different IP addresses, e.g. 1.2.3.4:80 and 5.6.7.8:80. Excellent.

(2) Connecting From Specific IP Address (and Port)

Socket>>primSocket: socketID connectTo: hostAddress port: port

There doesn't seem to be a way to initiate an "outgoing" connection FROM a particular IP Address on a multi-homed computer. You can choose the address and port of the destination for the connection but you can't choose the from/source ip address (and port). The address likely defaults to the computer's default address. This is fine when there is one ip address but when the computer has multiple ip addresses (i.e. multi-homed) it is a problem, especially when it's important to use particular ip addresses for particular purposes (e.g. logging, bandwidth tracking, web servers, etc.).

In a full socket protocol you can specify both "source ipaddress:port" and "destination ipaddress:port" pairs for outgoing connections. I propose that we extend the Squeak primitives (as follows) to support this full protocol.

"Allow selection of the outgoing ip address to use. Allow system to choose outgoing port."

Socket>>
    primSocket: socketID
    connectTo: destinationAddress port: destinationPort
    connectFrom: sourceAddress

Obviously the source address MUST be one of the ip addresses that the local computer is currently assigned. As such this method really only makes sense on multi-homed computers. This shouldn't allow "spoofing" of the sending ip address.

"Allow selection of the outgoing ip address to use. In addition request the outgoing port to use."

Socket>>
    primSocket: socketID
    connectTo: destinationAddress port: destinationPort
    connectFrom: sourceAddress port: sourcePort

Generally you don't care which outgoing port is used (as it might be in use already) so the first method is used more often.
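
At the BSD level, the two proposed primitives correspond to calling bind() on the socket before connect(). The following is only a sketch under that assumption, not VM code: the names connect_from and make_addr are hypothetical, it handles IPv4 only, and passing 0 as the source port leaves the choice of outgoing port to the system (the first proposed variant).

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: fill a sockaddr_in from a dotted-quad string.
   Returns 0 on success, -1 if the string is not a valid IPv4 address. */
static int make_addr(struct sockaddr_in *out, const char *ip, uint16_t port)
{
    memset(out, 0, sizeof *out);
    out->sin_family = AF_INET;
    out->sin_port = htons(port);
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}

/* Hypothetical helper: open a TCP connection to dst_ip:dst_port that
   originates from src_ip. A src_port of 0 lets the system pick the port.
   Returns the connected descriptor, or -1 on any failure. */
int connect_from(const char *src_ip, uint16_t src_port,
                 const char *dst_ip, uint16_t dst_port)
{
    assert(src_ip != NULL && dst_ip != NULL);

    struct sockaddr_in src, dst;
    if (make_addr(&src, src_ip, src_port) < 0 ||
        make_addr(&dst, dst_ip, dst_port) < 0)
        return -1;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* bind() fixes the local (source) end first; connect() then dials the
       destination from that address. */
    if (bind(fd, (struct sockaddr *)&src, sizeof src) < 0 ||
        connect(fd, (struct sockaddr *)&dst, sizeof dst) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

For example, connect_from("1.2.3.4", 0, "5.6.7.8", 80) would ask for a connection to 5.6.7.8:80 leaving from 1.2.3.4. On a typical system bind() rejects a source address that is not assigned to the local machine, which matches the "no spoofing" requirement stated above.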

Obviously we'd need the non-primitive methods as well. Does any version of squeak have these or equivalent primitives?

What do you think? I definitely need to have the ability to choose the outgoing ip address since they are assigned for particular purposes.

Andreas Raab Begins the discussion about future direction and BSD
Someone with a bit more socket knowledge should chime in here but it seems to me that this almost cries for a primitive doing the equivalent to BSD's bind() call - e.g., assign a local address and port to a socket. So wouldn't it be easier just to expose something like:

Socket>>bindTo: localAddress port: localPort

rather than trying to put all of this information in all of the varying places?

Ian Piumarta
This [a BSD type port] is precisely what

aSock listenOnPort: aPort backlogSize: nConn interface: anAddr

does. Cutting out the irrelevant implementation noise, the body of the prim contains

   saddr.port = aPort;
   saddr.addr = anAddr;
   bind(aSock, saddr);
   listen(aSock, nConn);

(nothing more, nothing less). IOW, the only difference between the prim and bind() is that the prim goes on to perform the listen() as well, since you're pretty much guaranteed to want to do that after binding the address.

FWIW, the only difference between that and the "old" listenOn: primitive is that the old one uses INADDR_ANY for the interface address (which causes the socket to be bound to the given port on all available local interfaces).
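
For readers who want to try the two behaviours Ian describes, here is a compilable sketch of the same bind()-then-listen() sequence. It is not the actual primitive source: the function name bind_and_listen is hypothetical, it is IPv4 only, and using a NULL address to stand for the "old" INADDR_ANY behaviour is a convention invented for this example.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a new TCP socket to ip:port and start listening.
   ip == NULL means INADDR_ANY (all local interfaces), like the old listenOn:.
   Returns the listening descriptor, or -1 on any failure. */
int bind_and_listen(const char *ip, uint16_t port, int backlog)
{
    assert(backlog >= 0);

    struct sockaddr_in saddr;
    memset(&saddr, 0, sizeof saddr);
    saddr.sin_family = AF_INET;
    saddr.sin_port = htons(port);
    if (ip == NULL)
        saddr.sin_addr.s_addr = htonl(INADDR_ANY);
    else if (inet_pton(AF_INET, ip, &saddr.sin_addr) != 1)
        return -1;                      /* not a valid dotted-quad address */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* The same two calls as in the primitive body quoted above. */
    if (bind(fd, (struct sockaddr *)&saddr, sizeof saddr) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

With this, bind_and_listen("1.2.3.4", 80, 8) and bind_and_listen("5.6.7.8", 80, 8) could coexist on a multi-homed machine, whereas bind_and_listen(NULL, 80, 8) claims port 80 on every interface.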

I concur. It would be best to mirror the API of the major operating systems' network stacks, BSD being the best. ;-) The UNIX Squeak (version Squeak3.6g-5420 with updates) does not yet have Socket>>primSocketListenOn:backlog:interface:.

The latest version on http://www-sor.inria.fr/~piumarta/squeak/index.html is Squeak3.6g-5420.image. Is there a newer version with this new primitive? If so, where can I get the latest version of the UNIX port or the C code updates for it? Or when is a new UNIX release coming?

Andreas Raab
Yes, I understand [what aSock listenOnPort:backlogSize:interface: does] - I was more thinking along the lines of actually simplifying the listen primitive(s) so that the ST side would look along the lines of:

listenOn: port backlogSize: count interface: ipAddr
   self bindTo: ipAddr port: port.
   self listen: count.

etc. So #bindTo:port: would handle all the aspects that are now done in the primitive and listen primitive would just listen on the locally assigned address - which means that the socket prims actually model the underlying BSD socket semantics much more closely, are easier to implement and probably a bit easier to use. [Ian concurred]

[It might be useful to do a bind() without a listen()] if you want to use a specific local interface in a connect() call (that was the starting point for the discussion). In general I think it's advantageous to model the Squeak primitives to closely resemble the BSD sockets interface just on the general grounds of simplifying the VM implementation. So that

#primitiveConnect has the semantics of connect()
#primitiveListen has the semantics of listen()
#primitiveBind has the semantics of bind()

etc. [Ian concurred]

Ian Piumarta
So, you want to do this? I suggest adding one "old-style" primitive in addition (which, along with several older prims, might be culled at some later deadline if backward compatibility goes out the window). As above, plus:

   primConnectTo:interface:

which does primConnect: with an explicit bind() on the socket first. (The reasoning being that it's one method in the image to make it available, keeping everything consistent and compatible with "the old way". From there, given the lower-level prims, you could swap out the guts of Socket to follow a strict BSD model any time you felt like hacking on the image a little. Then if you decide to give up on older images, throw the old prims away...)

Andreas Raab
For bind(), connect() and listen() I wouldn't expect [having to rewrite a bunch of really stale and ugly Mac prims to cope] to be a problem. All a bind primitive implementation would do is to remember the address+port for the socket. Then, upon connect or listen primitive it may decide to use only an address that's equivalent with localHostAddress and otherwise fail. This behavior would be precisely equivalent with that of a machine with a single interface and not require much extra effort.
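
The fallback described here, for a platform whose network stack has no real bind(), can be sketched as plain bookkeeping. Everything in this block is hypothetical (the emu_ names, the struct layout, and the use of 0 for "any address", which is the value of INADDR_ANY); it only illustrates the "remember now, check later" idea and is not taken from any VM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EMU_ANY_ADDR 0u   /* assumption: 0 means "any address", like INADDR_ANY */

/* Per-socket record of what the image asked bind to do. */
typedef struct {
    uint32_t addr;   /* requested local IPv4 address, or EMU_ANY_ADDR */
    uint16_t port;   /* requested local port, or 0 for "system chooses" */
} emu_socket;

/* A fresh socket has no local address preference. */
void emu_init(emu_socket *s)
{
    assert(s != NULL);
    s->addr = EMU_ANY_ADDR;
    s->port = 0;
}

/* The emulated bind primitive does no I/O; it only remembers the request. */
void emu_bind(emu_socket *s, uint32_t addr, uint16_t port)
{
    assert(s != NULL);
    s->addr = addr;
    s->port = port;
}

/* Called by the emulated connect and listen primitives. Returns 1 if the
   remembered address is usable on a single-interface host (it is either
   "any" or the host's own address) and 0 if the primitive should fail. */
int emu_address_ok(const emu_socket *s, uint32_t local_host_addr)
{
    assert(s != NULL);
    return s->addr == EMU_ANY_ADDR || s->addr == local_host_addr;
}
```

The design choice is that failure is deferred to connect or listen time, so an image written against a bind-style API still runs unchanged on a single-interface machine as long as it asks for that machine's own address.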

Given the other discussion with David, how about starting to work on a SocketPlugin v2 which contains these as well as some of the stuff David was proposing? E.g., this could then also support IPv6, interface enumeration etc. etc. etc.

John McIntosh
[Having to rewrite a bunch of really stale and ugly Mac prims to cope] is not a problem. As an example, the old OS 9 Open Transport code currently does:

             OTInitInetAddress(&epi->localAddress, 0, kOTAnyInetAddress);

which is part of the DoBind() procedure which binds the connection (like a socket) to the remote and local sides.

If you pass in the port and the interface address, we'll use those instead of the 0 and kOTAnyInetAddress (meaning any interface address).

Lex Spoon
There are two issues: compatibility to old images, and compatibility to MacOS classic. Old images use the old primitives, including the strange listenOn: primitive which closes the listening socket after the first connection. Granted, it has been a few years at this point. As a bigger issue, there was a problem using BSD sockets on MacOS classic. That's why Squeak has its distinctive sockets API to begin with... But I would guess neither of these is a big deal.

That aside, let's go whole hog! Make a real BSDSocket class that has all the functions BSD sockets are supposed to have. To get an idea of what is required, download Scheme Shell and look at what they did; I have heard they have a quite thorough wrapper for BSD sockets into Scheme, plus some nice utilities sitting on top of them.
http://www.scsh.net
Keep in mind that there are probably a lot of odds and ends that the current primitives overlook. For example, there is out of band data that would be useful for a proper telnet client. There are also various flags in some of the functions that we are probably overlooking.

Anyway, I am not volunteering any effort here, so do as ye will. :) It just worries me to see this endless tinkering. Either we want super-portable primitives like the original Squeak primitives, or we want to insist on full BSD semantics. Playing around in the in-between spaces seems strange.

John McIntosh
[It]'s not quite true [that the reason why Squeak has its distinctive sockets API is that there was a problem using BSD sockets on MacOS classic]. Since May of 2000 the classic Mac VM has used Open Transport, which supports the listen properly. I think at Smalltalk Solutions a few years back there was interest in doing a BSD socket like layer...

Andreas Raab
Well, I don't think that's actually true - IIRC, the reason for the socket interface was that the BSD sockets weren't the default on classic MacOS (I think it used to be third party). And since classic MacOS had a few very specific idiosyncrasies, the interface reflects some of them to the extent that one would expect.

I tried [looking at Scheme] but the one thing I couldn't find, and the only thing that I'd be really interested in, is how they deal with the equivalent to select() and friends. That's probably the only place where it gets interesting.

. . .Realistically, I don't care too much about [the various flags and odds and ends]. Quite to the contrary to be honest - my feeling is that various of the options we specify today can (and should) be set via options instead of having them inside some primitive or other. For example, it seems that the send and receive buffer size should be specified that way and looking at primSocket:setPort: just makes me want to puke.
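
In BSD terms, "set via options" means setsockopt(). As a rough illustration of what an options-based interface for the buffer sizes would sit on top of, here is a sketch; the helper name set_buffer_sizes is hypothetical, and the kernel is free to round or adjust the sizes it is asked for.

```c
#include <assert.h>
#include <sys/socket.h>

/* Hypothetical helper: request send and receive buffer sizes for a socket.
   Returns 0 if both options were accepted, -1 otherwise. */
int set_buffer_sizes(int fd, int send_bytes, int recv_bytes)
{
    assert(send_bytes > 0 && recv_bytes > 0);

    /* SO_SNDBUF and SO_RCVBUF are socket-level options, hence SOL_SOCKET. */
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF,
                   &send_bytes, sizeof send_bytes) < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &recv_bytes, sizeof recv_bytes) < 0)
        return -1;
    return 0;
}
```

A single generic option primitive of this kind would keep such tuning out of the connect and listen primitives; getsockopt() with the same option names reads back the values actually in effect.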

"This endless tinkering" is supposed to come up with a balanced POV between what primitives are supposed to do vs. what the Smalltalk code needs to deal with and how it affects compatibility issues. I wouldn't see the distinction as hard as you make it here - to me it seems quite reasonable to discuss these issues and get an understanding of how we would want to deal with them at minimal cost. Given what has been said so far, it seems that we may get away with relatively few changes, which effectively come down to providing an interface to bind() as well as a "pure" interface to listen() (e.g., one which doesn't take a port), though this could easily be "simulated" by specifying port zero. Exposing bind() addresses the issues at hand and would - a little further down the road - allow us to address some of the other issues.