TCP/IP Protocols for Plug-and-Play Industrial Automation

February 24, 2016/10 Comments/in Physics Articles/by Svein

Core Topics: IP, Protocol, RFC, network, address

Background: Ethernet vs Fieldbus in the 1990s

The 1990s saw the widespread adoption of network technology on two fronts: office automation and factory automation. But while office automation soon standardized on Ethernet (for volume reasons) and TCP/IP (due to the Internet explosion), factory automation fragmented into several incompatible fieldbuses (“Fieldbus” is a common designation for an automation network). One decade later, the consequences have materialized: networks based on Ethernet and TCP/IP are steadily getting faster and less expensive, while the Fieldbus market is more or less stuck at the low-speed, high-price level.

Article purpose and scope

Due to this development, several Fieldbus vendors have ported their system software to run on top of Ethernet and TCP/IP to be part of the continuing trend towards higher performance, lower prices, and standardized protocols. What they have failed to see, however, is that TCP/IP is not one protocol, but a large set of protocols aimed at everything from installation and configuration to network management. The purpose of this article is to shed some light on these additional protocols and show how they could be used to pave the way for “plug-and-play” automation networks.

TCP/IP Basics – IETF and RFC

The generic term “TCP/IP” usually means anything and everything related to the specific protocols of TCP and IP. It can include other protocols, applications, and even the network medium. A sample of these protocols is UDP, ARP, and ICMP. A sample of applications is TELNET, FTP, and RCP. TCP/IP runs equally well on top of Ethernet, ATM, and Wireless LAN, making higher-level protocols very portable.

The TCP/IP protocols are described in documents called RFCs (RFC stands for Request for Comments). The ultimate adoption of these as standards is governed by a body called the IETF (Internet Engineering Task Force).

Some of the Protocols in the Family

IP – Internet Protocol – RFC 791 [1] (version 4) and RFC 2460 [2] (version 6). Uses an “IP address” as a logical identifier. It is designed for routing across subnets. A very simple mechanism (using the “IP mask”) identifies the subnet. IP is not attached to any specific hardware. Version 4 is almost exclusively used today; version 6 is slowly gaining ground.
ARP – Address Resolution Protocol – RFC 826 [3]. Used to translate IP addresses to Ethernet addresses. The translation is performed with a table look-up; the ARP table is filled “as needed” by the ARP protocol.
UDP – User Datagram Protocol – RFC 768 [4]. UDP is a connectionless datagram delivery service that does not guarantee delivery. It does not maintain an end-to-end connection with the remote UDP module; it merely pushes the datagram out on the net and accepts incoming datagrams off the net.
TCP – Transmission Control Protocol – RFC 793 [5]. TCP offers a connection-oriented byte stream. It is a sliding-window protocol with time-outs and retransmissions. In contrast to UDP, TCP guarantees delivery. In the automation context, it must be stressed that TCP packetizes the byte stream at will; it does not retain the boundaries between writes.
DHCP – Dynamic Host Configuration Protocol – RFC 2131 [6]. DHCP’s purpose is to enable individual computers on an IP network to extract their configurations from a server (the “DHCP server”). DHCP consists of two components: a protocol for delivering node-specific configuration parameters from a DHCP server to a node and a mechanism for allocation of network addresses (IP addresses) to nodes. The overall purpose of DHCP is to reduce the work necessary to administer a large IP network.
TELNET – RFC 854 [7]. TELNET provides a remote login capability on TCP. The operation and appearance are similar to an ASCII terminal connected directly to a computer with a command-line interface. If the user types “telnet delta” on the command line, he receives a login prompt from the computer called “delta”.
FTP – File Transfer Protocol – RFC 959 [8]. FTP, as old as TELNET, also uses TCP and has widespread interoperability. The operation and appearance are as if you had TELNETed to the remote computer, but instead of typing your usual commands, you have to make do with a short list of commands for directory listings and the like. FTP commands allow you to copy files between computers.
TFTP – Trivial File Transfer Protocol – RFC 1350 [9]. TFTP is a very simple protocol to transfer files, hence its name. It is implemented on top of UDP, so it may be used to move files between machines on different networks implementing UDP. It is designed to be small and easy to implement and therefore lacks most of the features of FTP. The only thing it can do is read and write files from/to a remote server.
SNMP – Simple Network Management Protocol – RFC 1157 [10]. SNMP uses UDP and is designed for use by central network management stations. The central station uses SNMP to collect management data from other computers on the network. SNMP defines the format for the data; it is left to the central station or network manager to interpret the data.
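
The remark in the TCP entry about lost write boundaries matters in practice: an automation protocol running over TCP has to mark message boundaries itself. One common way to do so is a length prefix. The following is only a minimal sketch of that idea; the 4-byte big-endian header and the function names are our own assumptions, not part of any particular automation protocol.

```python
import socket
import struct

# Assumed convention for this sketch: every message is preceded by a
# 4-byte big-endian length field.
HEADER = struct.Struct("!I")


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send one message as <length><payload>."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes; TCP may deliver them in several pieces."""
    chunks = []
    remaining = count
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock: socket.socket) -> bytes:
    """Receive one complete message, however TCP split or merged the writes."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

With this in place, two back-to-back calls to `send_message` are read back as two separate messages by `recv_message`, even if the network delivers them in a single segment.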

Identifying the Units – Basic Addressing

Ethernet and Wireless LAN nodes running TCP/IP all share the same addressing mechanisms:

They have a unique hardware address assigned at the time of production (MAC address).
They get a unique logical address (IP address) assigned as part of the system set-up or at boot time.

The IP protocol is only concerned with the logical address of the nodes, using a specialized protocol (ARP – the Address Resolution Protocol) to map logical addresses to physical addresses when necessary. (IP version 6 does not use ARP; address resolution is handled by its Neighbor Discovery protocol, and an auto-configuration mechanism can generate the device part of the logical address directly from the physical address.)
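
To make the IPv6 remark concrete, here is a small sketch of one well-known derivation scheme, the modified EUI-64 interface identifier: the 48-bit MAC address is split in the middle, the bytes `ff:fe` are inserted, and the universal/local bit of the first byte is inverted. The function names and the sample MAC address are ours, and many current systems deliberately use other (randomized) identifiers, so treat this as an illustration only.

```python
import ipaddress


def eui64_interface_id(mac: str) -> bytes:
    """Build the 8-byte modified EUI-64 interface identifier from a MAC address."""
    octets = bytes(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Invert the universal/local bit and insert ff:fe between the two halves.
    return bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]


def link_local_address(mac: str) -> ipaddress.IPv6Address:
    """Combine the link-local prefix fe80::/64 with the interface identifier."""
    prefix = bytes.fromhex("fe80") + bytes(6)
    return ipaddress.IPv6Address(prefix + eui64_interface_id(mac))


# Hypothetical MAC address, for illustration only.
print(link_local_address("00:1a:2b:3c:4d:5e"))  # fe80::21a:2bff:fe3c:4d5e
```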

Three identity types for automation devices

In an automation or measurement system, there will probably exist many identical nodes with similar or identical tasks to handle but located at different places in a task hierarchy. In this case, it will be impossible for the configuration software to tell these devices apart (it cannot see which of the identical units sits at the start or end of a conveyor belt for example). Figure 1 shows a part of such an automation network with three identical nodes. It also shows three distinct identities for one of the nodes. Those are:

The MAC address. This identity is uniquely associated with the network interface. If the module is exchanged for another, the MAC address will change.
The IP address. This identity is associated with the network configuration. If the module is exchanged for another, the IP address may or may not change.
The automation identity. This identity is associated with the function of the device. If the module is exchanged for another, the automation identity must remain constant.

Problems with classical Fieldbus addressing

In the classical Fieldbus scenario, no MAC address existed; the network address was set by hand and also used for the automation identity. In such a scenario, the systems engineer had to keep track of everything manually. Usually, he started by creating a “map” of the overall system, assigned identities to the various nodes on the map, and, carrying the map along, saw to it that the real nodes were assigned network addresses equal to the identities indicated by the map. When the hardware addresses are fixed and logical addresses are automatically assigned, we cannot use this approach.

A better approach would be to attach an automation identity to each device. This could be done by entering this description into non-volatile storage, or by physically attaching an automation identity source to each device site (remember that the automation identity is not associated with the device but with the purpose of the device). This identity could be used for automatic configuration purposes. We shall return to this subject a little later.

Use Names Instead of Addresses!

The problem with Fieldbus addresses and automation identities is that they carry no meaning. The automation systems engineer might remember that device 27 on one Fieldbus is a combined flow meter and temperature sensor, but mostly it has to be looked up in the system documentation (assuming that it has been written and can be found). A descriptive name, however, like “main_flowmeter” would help make the control structure more understandable.

Device reference table and middleware

Of course, such names can be used today. Textually substituting “27” for “main_flowmeter” is something most system compilers can easily handle. The disadvantage of this method is that the name does not appear anywhere in the controller and cannot be used for anything. What we propose instead is to introduce a table of device references and a small middleware layer into the automation protocol stack. Every entry in this table should contain three fields: the device name, the automation device identity, and the IP address. Thus, whenever the controller wants to talk to automation device 27, the middleware looks in the table and retrieves the IP address of the device. If the IP address is not in the table, the middleware uses a name server to fetch the IP address, using the device name.
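
The table and the middleware lookup described above could look roughly like the sketch below. The class and method names are hypothetical; the only library call assumed is the standard `socket.gethostbyname`, which stands in for "ask the name server" and can be replaced by any other resolver.

```python
import socket
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class DeviceEntry:
    name: str                         # descriptive device name
    automation_id: int                # identity used by the control program
    ip_address: Optional[str] = None  # filled in on first use


class DeviceTable:
    """Hypothetical middleware table: automation identity -> name -> IP address."""

    def __init__(self, resolver: Callable[[str], str] = socket.gethostbyname):
        self._entries: Dict[int, DeviceEntry] = {}
        self._resolve = resolver

    def add(self, name: str, automation_id: int,
            ip_address: Optional[str] = None) -> None:
        self._entries[automation_id] = DeviceEntry(name, automation_id, ip_address)

    def ip_for(self, automation_id: int) -> str:
        entry = self._entries[automation_id]
        if entry.ip_address is None:
            # Not in the table yet: ask the name server, using the device name.
            entry.ip_address = self._resolve(entry.name)
        return entry.ip_address
```

The controller keeps addressing "device 27"; only the middleware ever sees the IP address, so a replaced module with a new address needs no change in the control program.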

The DNS System and Protocol

The Domain Name System (DNS, [11]) uses hierarchical names and name servers. Each name server knows its domain name and the names of the devices inside it. Any request for a local name is resolved locally; other requests are passed along to the next level. The levels are separated by dots, just as in Internet site names. Thus, if your company is referred to as bigcomp and the flow meter is in hall 4 of the New Jersey factory, every controller in your company should be able to access the flow meter using the name “main_flowmeter.hall4.nj.bigcomp”. Inside the hall4 subdomain, it is enough to refer to main_flowmeter. Remember, however, that these are internal names; you should not expose them on the Internet (no automation device should ever be visible on the Internet, due to the risk of sabotage by hackers and virus authors).

If every device is given a name corresponding to its automation identity, it should routinely register with the local DNS server when the startup and configuration phase is finished. That way, every controller would be able to access the device without any knowledge of networking details. The only snag is: how does the device get hold of an IP address for itself and how does it get hold of the DNS server IP address?
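
The hierarchical lookup and the registration step can be illustrated with a toy model. This is not the DNS wire protocol and not how a real resolver is implemented; it only mimics the behaviour described above (local names are answered locally, everything else is handed to the next level). All class names, domain names, and addresses are made up for the illustration.

```python
from typing import Dict, Optional


class NameServer:
    """Toy name server: knows its own domain and the devices inside it."""

    def __init__(self, domain: str, parent: Optional["NameServer"] = None):
        self.domain = domain
        self.parent = parent
        self.records: Dict[str, str] = {}            # short name -> IP address
        self.children: Dict[str, "NameServer"] = {}  # subdomain -> its server
        if parent is not None:
            parent.children[domain] = self

    def register(self, short_name: str, ip_address: str) -> None:
        """Called by a device when its startup phase is finished."""
        self.records[short_name] = ip_address

    def resolve(self, name: str) -> str:
        if "." not in name:
            name = f"{name}.{self.domain}"           # short names are local
        host, _, domain = name.partition(".")
        if domain == self.domain:
            return self.records[host]                # resolved locally
        if domain.endswith("." + self.domain):       # somewhere below us
            for child_domain, child in self.children.items():
                if domain == child_domain or domain.endswith("." + child_domain):
                    return child.resolve(name)
            raise KeyError(name)
        if self.parent is None:
            raise KeyError(name)
        return self.parent.resolve(name)             # pass to the next level


bigcomp = NameServer("bigcomp")
nj = NameServer("nj.bigcomp", parent=bigcomp)
hall4 = NameServer("hall4.nj.bigcomp", parent=nj)
hall2 = NameServer("hall2.nj.bigcomp", parent=nj)
hall4.register("main_flowmeter", "10.4.0.27")
```

A controller in hall 4 can ask for `main_flowmeter`, while a controller in hall 2 has to ask for `main_flowmeter.hall4.nj.bigcomp`; both receive the same address.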

Automate Network Administration with DHCP

Required IP parameters

When the TCP/IP protocol suite is used to handle end-to-end message transfers, every node must be configured with a set of parameters:

The IP address of the node
The IP subnet mask
The IP address of the master controller
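
As an illustration of how these three parameters work together, the sketch below uses Python's standard `ipaddress` module to apply the subnet mask and decide whether the master controller is on the node's own subnet. The class name and the sample addresses are assumptions made for the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeConfig:
    address: str       # IP address of the node
    subnet_mask: str   # IP subnet mask
    controller: str    # IP address of the master controller

    def network(self) -> ipaddress.IPv4Network:
        """Apply the mask to the node address to obtain its subnet."""
        return ipaddress.ip_network(f"{self.address}/{self.subnet_mask}", strict=False)

    def controller_is_local(self) -> bool:
        """True if the controller can be reached without going through a router."""
        return ipaddress.ip_address(self.controller) in self.network()


cfg = NodeConfig("192.168.4.27", "255.255.255.0", "192.168.4.1")
print(cfg.network())              # 192.168.4.0/24
print(cfg.controller_is_local())  # True
```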

Drawbacks of manual configuration

A simple solution is to program these parameters into each node using some sort of configuration tool. This will usually work satisfactorily on smaller systems, but the procedure has several drawbacks:

It involves copying a lot of meaningless information and is therefore error-prone
Whenever a defective module is exchanged, the configuration procedure must be repeated for the new module

In office automation networks, such manual procedures have long been discarded, and automatic procedures have been introduced. There is no reason why automation systems should not use the same procedures.

The DHCP Protocol Supplies Useful Information

A DHCP session is run very infrequently, and the interval can be configured. It is always run when a device powers up. The information supplied in a standard DHCP session is the IP address the node may use and some other IP parameters. The protocol may also furnish some additional information, which may prove very useful for automation purposes:

A boot server hostname
A boot file name (which is supposed to be located on the boot server)
The IP address of the local Domain Name Server
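
To give an impression of how a device might pick these items out of a DHCP reply, here is a sketch of a parser for the options area, which is a sequence of code/length/value triples. The option codes used (1 subnet mask, 6 domain name servers, 66 TFTP server name, 67 boot file name, 0 pad, 255 end) are the standard ones as far as we know; note that the boot server name and boot file name may instead arrive in the fixed `sname` and `file` fields of the message, which this sketch does not handle. The function names and sample values are ours.

```python
import socket
from typing import Dict, List

PAD, END = 0, 255
SUBNET_MASK = 1
DNS_SERVERS = 6
TFTP_SERVER_NAME = 66
BOOTFILE_NAME = 67


def parse_options(data: bytes) -> Dict[int, bytes]:
    """Split a DHCP options area into {option code: raw value}."""
    options: Dict[int, bytes] = {}
    i = 0
    while i < len(data):
        code = data[i]
        if code == END:
            break
        if code == PAD:          # single-byte filler, no length field
            i += 1
            continue
        if i + 1 >= len(data):
            raise ValueError("truncated option header")
        length = data[i + 1]
        value = data[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated option value")
        options[code] = value
        i += 2 + length
    return options


def address_list(value: bytes) -> List[str]:
    """Decode an option value that holds one or more IPv4 addresses."""
    return [socket.inet_ntoa(value[i : i + 4]) for i in range(0, len(value), 4)]


def encode_option(code: int, value: bytes) -> bytes:
    """Helper used here only to build sample data."""
    return bytes([code, len(value)]) + value


# Hand-made sample data, not captured from a real server.
sample = (
    encode_option(SUBNET_MASK, socket.inet_aton("255.255.255.0"))
    + encode_option(DNS_SERVERS, socket.inet_aton("192.168.4.2"))
    + encode_option(TFTP_SERVER_NAME, b"bootsrv")
    + encode_option(BOOTFILE_NAME, b"flowmeter.cfg")
    + bytes([END])
)
options = parse_options(sample)
```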

A Possible Startup Sequence

Using the protocols described above, we are in the position to create a startup sequence for a “plug and play” automation system:

After power-on and self-test, run DHCP. The DHCP session should provide the server hostname, the boot file name, and the Domain Name Server address.
Register with the Domain Name Server.
Use the Domain Name Server to get hold of the IP address of the boot server.
Download the boot file using TFTP.
Check the revision number of the latest firmware against the current revision number. If the current revision is older, then download the latest firmware and store it in program memory.

From this point on, the procedure depends on the high-level protocol used. Some protocols expect the devices to keep still and wait until they are polled, others may want the device to contact the controller.
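
The five steps above can be sketched as a short routine. Everything network-related is hidden behind a hypothetical `net` object (its method names are our invention), and the boot file is assumed to be a small text file with `revision=` and `firmware=` lines; a real system would define its own format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Lease:
    """What the DHCP session is expected to deliver (step 1)."""
    ip_address: str
    boot_server_name: str
    boot_file_name: str
    dns_server: str


def parse_boot_file(data: bytes) -> Tuple[int, str]:
    """Assumed boot file format: 'revision=<n>' and 'firmware=<file name>' lines."""
    lines = data.decode("ascii").splitlines()
    fields = dict(line.split("=", 1) for line in lines if "=" in line)
    return int(fields["revision"]), fields["firmware"]


def start_up(device_name: str, current_revision: int, net) -> int:
    """Run the startup sequence; returns the firmware revision now installed."""
    lease = net.run_dhcp()                                            # step 1
    net.register(device_name, lease.ip_address, lease.dns_server)     # step 2
    boot_ip = net.resolve(lease.boot_server_name, lease.dns_server)   # step 3
    boot_file = net.tftp_get(boot_ip, lease.boot_file_name)           # step 4
    latest_revision, firmware_file = parse_boot_file(boot_file)
    if current_revision < latest_revision:                            # step 5
        net.store_firmware(net.tftp_get(boot_ip, firmware_file))
        return latest_revision
    return current_revision
```

Keeping the network services behind one object makes it easy to test the sequence without any hardware, and to swap in whatever DHCP, DNS, and TFTP client code the device actually uses.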

Other Useful Protocols and Services

SNMP Helps Network Administrators

The Simple Network Management Protocol is an application-layer protocol designed to facilitate the exchange of management information between network devices. By using SNMP-transported data (such as packets per second and network error rates), network administrators can more easily manage network performance, find and solve network problems, and plan for network growth.

Today, SNMP is the most popular protocol for managing networks. SNMP-related standardization activity continues even as vendors develop and release state-of-the-art, SNMP-based management applications. SNMP is a relatively simple protocol, yet its feature set is sufficiently powerful to handle the problems presented in trying to manage today’s networks.

IGMP Manages Network Groups

The publish/subscribe model allows independently developed distributed applications to be able to exchange information in an event-driven manner without needing to know the source of the data or the network topology. Information producers publish information anonymously. Subscribers anonymously receive messages without requesting them.

Like other broadcast-based models, publish/subscribe is network-efficient on hub-based systems. For example, if the cost of energy changes in a distribution system, only a single transmission is required to update all of the devices dependent on the energy price (this is, of course, the best or most optimistic case).

IGMP [12] is a protocol for devices to use when they want to join or leave multicast groups. By sending a Membership Report to its immediately neighboring router, a device informs the router that it wishes to become part of a multicast group. Routers periodically transmit Membership Query messages to determine which host groups have members on their directly attached networks.
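
An application rarely builds IGMP messages itself. On most systems it asks the IP stack to join a group through a socket option, and the stack then sends the Membership Report on its behalf. The sketch below shows this with the standard `IP_ADD_MEMBERSHIP` and `IP_DROP_MEMBERSHIP` options; the helper names and the group address in the usage note are assumptions for the example.

```python
import ipaddress
import socket


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte request: group address followed by local interface address."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    return socket.inet_aton(group) + socket.inet_aton(interface)


def join_group(sock: socket.socket, group: str, interface: str = "0.0.0.0") -> None:
    """Ask the IP stack to join the group; it reports the membership via IGMP."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group, interface))


def leave_group(sock: socket.socket, group: str, interface: str = "0.0.0.0") -> None:
    """Tell the IP stack that this socket no longer wants the group's traffic."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    membership_request(group, interface))
```

A subscriber for, say, an energy-price topic would create a UDP socket, bind it to the agreed port, and call `join_group(sock, "239.1.2.3")`; the interface address `0.0.0.0` leaves the choice of network interface to the operating system.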

Conclusion: The Tools Are Available – Use Them!

The protocols and services we have been discussing are neither resource-hungry nor in any way real-time. They typically run when a device is powered up for the first time, and not very often after that. In addition, they do not interfere with existing automation protocols, and should therefore be ideal candidates for the first step towards a true “plug and play” automation system.

References

[1] RFC 791, “Internet Protocol”. Online at http://www.ietf.org/rfc.html.

[2] RFC 2460, “Internet Protocol, Version 6 (IPv6) Specification”. Online at http://www.ietf.org/rfc.html.

[3] RFC 826, “Ethernet Address Resolution Protocol: Or converting network protocol addresses to 48.bit Ethernet address for transmission on Ethernet hardware”. Online at http://www.ietf.org/rfc.html.

[4] RFC 768, “User Datagram Protocol”. Online at http://www.ietf.org/rfc.html.

[5] RFC 793, “Transmission Control Protocol”. Online at http://www.ietf.org/rfc.html.

[6] RFC 2131, “Dynamic Host Configuration Protocol”. Online at http://www.ietf.org/rfc.html.

[7] RFC 854, “Telnet Protocol Specification”. Online at http://www.ietf.org/rfc.html.

[8] RFC 959, “File Transfer Protocol”. Online at http://www.ietf.org/rfc.html.

[9] RFC 1350, “The TFTP Protocol (Revision 2)”. Online at http://www.ietf.org/rfc.html.

[10] RFC 1157, “Simple Network Management Protocol (SNMP)”. Online at http://www.ietf.org/rfc.html.

[11] RFC 1035, “Domain names – implementation and specification”. Online at http://www.ietf.org/rfc.html.

[12] RFC 1112, “Host extensions for IP multicasting”. Online at http://www.ietf.org/rfc.html.

Svein

Master’s in Mathematics, Norway. Interested in Network-based time synchronisation.

10 replies

Greg Bernhardt says:
May 19, 2016 at 8:27 pm
Interesting Insight!
JakeBrodskyPE says:
May 19, 2016 at 8:27 pm
“what happens if you have a heart attack and end up in hospital? Can others figure out what to do in case of a breakdown or will the whole thing stop because you are the only one who knows the system (and you are not available).”
No, things don’t go to a screeching halt. We don’t design our control systems that way. In any case, we have a staff of 10 control systems engineers. We cover for each other and we cross over various disciplines and projects. Obviously, if those on call have questions, they’ll reach out to the person who touched a system last. If that person is not available, others will step in, but recovery will probably take longer.
There are no critical staff in our group.
Svein says:
May 19, 2016 at 8:27 pm
I understand your frustration, it matches the stories I have heard from others (I was involved in ABB R&D for 15 years). And, believe me, the fieldbus guys were even more frustrated. Whichever fieldbus they were trying to standardize on, the customer had a different flavor. And if by chance the customer had a fieldbus the installation guys knew about, the system documentation was always missing/incomplete/out of date.
What we did way back when was to tell those people that they could standardize on Ethernet hardware – inexpensive and readily available. And the very cheapest Ethernet (the 100Mbit/s version) was way faster than the fastest fieldbus.
Ethernet as a hardware concept caught on, but immediately every vendor wanted to run a proprietary protocol on top of the Ethernet MAC layer – and as far as I know they are still at it. TCP/IP was nobody’s choice (it was never meant to be an automation protocol), but it was the only thing that worked. The trouble was that the fieldbus guys still wanted to configure everything by hand, and could not understand that an Ethernet controller came with a built-in hardware address.
My purpose in writing this small guide was to tell those fieldbus guys: Hey, if you are running an automation protocol on TCP/IP, there are several tools available that can make life a bit easier for you!
“The other Operators, Technicians, and Engineers know my phone numbers. If something breaks and they think I had something to do with it, I’ll hear about it. I go on 24 hour call for one week every month and a half.”
Oops – what happens if you have a heart attack and end up in hospital? Can others figure out what to do in case of a breakdown or will the whole thing stop because you are the only one who knows the system (and you are not available).
JakeBrodskyPE says:
May 19, 2016 at 8:27 pm
All I’m going to say Svein, is go ahead and try living with this stuff. I have spent most of my career living with my creations through the entire life-cycle. The other Operators, Technicians, and Engineers know my phone numbers. If something breaks and they think I had something to do with it, I’ll hear about it. I go on 24 hour call for one week every month and a half.
No, I don’t hard-wire the ARP tables except in extreme cases. But I have done it to get around poor behavior in embedded devices.
Let me tell you a little story from over 25 years ago. I recall, as a young technician going to school at night, a consulting programmer who visited us to help commission our SCADA system (in the mid 1980s). He told us of the UCA effort and how it was going to unify the SCADA world. Everything would have objects and those objects would be standard, and it would all just drag and drop in to a new nirvana of design and ease of maintenance. Well, we waited, breathlessly at first, and then curiously, and then we sort of forgot about it. Yes, they’ve been at it for a long time.
The IEC 61850 object model and the MMS/GOOSE transport have a major problem to overcome: The object definitions should match the devices in the field, NOT THE OTHER WAY AROUND! Yet, that’s exactly what 61850 tries to do. And that’s why, decades later, they’re only just starting to see some meager traction. It works if you have a green-field, clean sheet design. But that’s not what most users have. And if you try to mix GE, ABB, or Siemens gear, you’ll be in for a rather unpleasant surprise. It’s not as plug-and-play as it might first seem. I wish I could say something nicer about it because I would really like to have seen this work. But after decades of effort, they still don’t have a lot to show for it – and I’m pretty sure that it’s because they have firmly planted the cart in front of the horse. Designing hardware to meet a software goal is a non-starter in most engineering houses. Go try it some time and let me know how well it works.
If you’d like to see a sample of the problems the 61850 standard has faced, read the verbiage on the compliance certificate: “This device has not been shown to be incompatible with the standard.” Doesn’t that inspire confidence?
By the way, this same stumbling block is one of the reasons the FieldBus standard got stalled in committee for about 15 years.
Frankly, I’m tired of kids in cubicles who think a pump is an object. I’ll slap a hard-hat on their heads, some steel toed boots on their feet, and drag them out to a pumping station and show them the subtle differences of each pump in the station. No, they’re not easily summarized by objects. These things grow and morph, and change as the years go by. Unless you install and commission them all at the same time, these objects do not copy directly. They cannot be standard. A pump of brand X from 10 years ago is not the same as pump Y from two years ago, or pump Z from 50 years ago. That’s the fallacy of trying to build standard objects.
Yes, I’ve used object models to handle identical devices in new installations. It saves some commissioning time there. But that’s a far cry from the one true object model of everything.
I prefer simplicity. Keep It Simple, Stupid!
When you have to live with your creations in a 24/7 environment year after year, you too will understand.
Svein says:
May 19, 2016 at 8:27 pm
“I do not use DNS for naming ANYTHING in a machine to machine network.
1. It is a central point of failure. The rest of the network can be seriously bollixed up from the latency of looking something up on DNS –especially if the latency of the DNS goes up for some reason. This is a real time network and milliseconds matter. I hard-code IP addresses.
2. The value of DNS is that if the addresses change, the DNS can update the rest of the world. Well, I don’t want the rest of the world poking in this real time network. Real time networks tend to be very static. They don’t change for YEARS.”
This is, of course, an option. The trouble doesn’t surface until one device breaks and you have to replace it. What IP address should the new device have (you have carefully documented every device in your network, of course) and how do you connect it to the automation/measurement controller program?
“I am very cautious about ARP. Many routers and switches have ridiculous ARP timeouts of HOURS. If I swap out a broken controller, I don’t want to wait more than a couple minutes for the ARP protocol to forward the addresses across the network. Many embedded devices do it badly too. I am not above setting static ARP caches for an industrial control system.”
Again, if you have to replace a broken device, it will have a new Ethernet address (if you are not going the “locally administered Ethernet Address” way). So you have a new device, you hardcode the same IP address – but without ARP, the device will never answer.
“The advantage of DHCP is that anyone can join the network with minimal effort. Well, I don’t want just ANYONE to join my real time networks.”
That is not a problem. You can easily insert Access Control Lists in a DHCP server, ensuring that only your approved devices get a valid IP address.
“Simple Network Management Protocol isn’t.”
Agree. But there have been several revisions to SNMP with updates and corrections (I think the latest is RFC 3411). It is steadily getting better.
You might be interested in the IEC 61850 standard (https://en.wikipedia.org/wiki/IEC_61850). This is a standard for managing large measurement and automation networks.
JakeBrodskyPE says:
May 19, 2016 at 8:27 pm
I’ve seen numerous attempts to set up Plug-and-Work applications for industrial control systems. Damned few of them have ever served me well. There are too many ways that it can fail or do unexpected things.
I do not use DNS for naming ANYTHING in a machine to machine network.
1. It is a central point of failure. The rest of the network can be seriously bollixed up from the latency of looking something up on DNS –especially if the latency of the DNS goes up for some reason. This is a real time network and milliseconds matter. I hard-code IP addresses.
2. The value of DNS is that if the addresses change, the DNS can update the rest of the world. Well, I don’t want the rest of the world poking in this real time network. Real time networks tend to be very static. They don’t change for YEARS.
I am very cautious about ARP. Many routers and switches have ridiculous ARP timeouts of HOURS. If I swap out a broken controller, I don’t want to wait more than a couple minutes for the ARP protocol to forward the addresses across the network. Many embedded devices do it badly too. I am not above setting static ARP caches for an industrial control system.
Simple Network Management Protocol isn’t. It looks like a bunch of Computer Science professors sat around a table, ignoring everything that came before in SCADA systems design and then took some really cool ideas and implemented them very badly. The object tree was a great start. But ASN.1 was a disaster. It had no concept of an event, either. SNMP could have been much nicer than it is.
I don’t use DHCP for automation either. DHCP takes too much time to address equipment after a power failure. Furthermore, the equipment is supposed to stay up for weeks and months at a time. Again, these are very very regular operations on very unchanging static networks. The advantage of DHCP is that anyone can join the network with minimal effort. Well, I don’t want just ANYONE to join my real time networks. I keep my subnets small, and I control my address spaces with a tight fist. Everything is alarmed. Light up the wrong port and there will be a big yellow banner across every operator station that somebody is working on the network that needs to be watched.
Frankly, the Internet Protocol suite is a poor fit for industrial control applications. I use it because the software has been battle tested, because people think they understand it, and it gives them a warm fuzzy feeling. Unfortunately, the reality is that very few understand it well enough to diagnose anything. I’ve seen it fail in ways that make most IT networking guys wonder what hit them. I don’t like complexity like this. I prefer to keep things as simple as possible. Use as few of the RFC suite protocols as you need to get the job done. Watch your latency, and beware of dependencies and single points of failure.
I have much more to say about this, and in fact, I’ve written chapters and edited a book on the subject. The advice that Svein offers here is well intentioned, but unintentionally toxic in some situations.
Jacob Brodsky, PE
Svein says:
May 19, 2016 at 8:27 pm
“Two: I seem to remember that Ethernet wasn’t qualified for real-time solutions because of this collision detection / backing off mechanism (CSMA/CD): you can’t guarantee a time limit for getting the message across. So a lot of manufacturers developed buses that had some kind of interrupt level facility. Is that still an issue ?”
No, it is not an issue today. Switched Ethernet does not use CSMA/CD. Instead, the switches receive the packets and put them in the queue for the destination node. The only way you can lose packets is to have a massive overload of the network (meaning that the link to the destination node is busy all the time and the output queue is full).
In addition, IEEE 802.1D defines a mechanism for packet priority and virtual private networks. Packet priority is especially relevant for automation and measurement networks.
jedishrfu says:
May 19, 2016 at 8:27 pm
Nice insight article!
BvU says:
May 19, 2016 at 8:27 pm
Two things: One: figure 1 didn’t make it.
Two: I seem to remember that Ethernet wasn’t qualified for real-time solutions because of this collision detection / backing off mechanism (CSMA/CD): you can’t guarantee a time limit for getting the message across. So a lot of manufacturers developed buses that had some kind of interrupt level facility. Is that still an issue ?
Greg Bernhardt says:
April 2, 2016 at 4:32 pm
Interesting Insight!
