The subnetwork entry in Wikipedia describes this in some detail.
Just a quick overview (ignoring IPv6 for simplicity, and assuming an understanding of binary numbers):
An IPv4 address is a 32-bit number, written for convenience as four groups of 8 bits, each shown in decimal (rather than hex, for some odd reason).
In a local network, the leading bits are the network prefix (ie, they identify the subnet), and the trailing bits are the host address (ie, they identify each computer on the subnet). The netmask is just a 32-bit value, written like an IP address, in which the network prefix bits are all set to 1 and the host address bits are all set to 0 (so it can be used to mask out the network prefix from the full IP address).
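As a rough sketch of what "masking" means in practice, the snippet below converts dotted-decimal strings to 32-bit integers and applies a bitwise AND. The helper names (to_int, to_dotted, split_address) and the sample address 10.1.2.3 with netmask 255.255.0.0 are made up for illustration; they are not from any library.

```python
def to_int(dotted):
    """Convert a dotted-decimal IPv4 string to a 32-bit integer."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def to_dotted(value):
    """Convert a 32-bit integer back to dotted-decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def split_address(address, netmask):
    """Return (network prefix, host part) as dotted-decimal strings."""
    addr = to_int(address)
    mask = to_int(netmask)
    network = addr & mask              # keep only the leading (prefix) bits
    host = addr & ~mask & 0xFFFFFFFF   # keep only the trailing (host) bits
    return to_dotted(network), to_dotted(host)


print(split_address("10.1.2.3", "255.255.0.0"))  # ('10.1.0.0', '0.0.2.3')
```

The extra "& 0xFFFFFFFF" is needed because Python integers are not fixed-width, so inverting the mask would otherwise give a negative number.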
Since the trailing bits are the host address, their count determines the number of available IP addresses in the subnet: if there are N host address bits, then there are 2^N-2 possible host addresses. The '-2' is because 2 of the addresses are reserved: one is the network address, in which all host address bits are set to 0, and the other is the broadcast address, in which all host address bits are set to 1.
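The 2^N-2 rule can be written as a tiny function. This is only a sketch: the name usable_hosts is hypothetical, and it deliberately rejects prefixes longer than /30, since /31 and /32 are special cases that the simple formula does not cover.

```python
def usable_hosts(prefix_length):
    """Assignable host addresses for an IPv4 prefix length, using 2^N - 2."""
    if not 0 <= prefix_length <= 30:
        raise ValueError("the 2^N - 2 rule is only applied here to /0 through /30")
    host_bits = 32 - prefix_length   # N: bits left over after the network prefix
    return 2 ** host_bits - 2        # subtract the two reserved addresses


for prefix_length in (16, 24, 26):
    print(prefix_length, usable_hosts(prefix_length))
# 16 65534
# 24 254
# 26 62
```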
Take, for example, a typical local network IP address of 192.168.10.3. The binary representation of this is:
11000000 10101000 00001010 00000011
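If you want to check that conversion yourself, a couple of lines of Python will do it (a quick sketch; each decimal group is formatted as 8 binary digits):

```python
address = "192.168.10.3"

# Format each decimal group as an 8-digit binary string, padded with zeros.
binary = " ".join(format(int(octet), "08b") for octet in address.split("."))

print(binary)  # 11000000 10101000 00001010 00000011
```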
This is a typical class C address (in the older, pre-CIDR classful terminology), meaning that the network prefix is probably the first 24 bits, ie:
11000000 10101000 00001010
(which could also be written as 192.168.10.0/24)
And the network mask is just a bitmask that masks those first 24 bits, ie:
11111111 11111111 11111111 00000000
(also written as 255.255.255.0, or just /24)
The host address is the remaining 8 bits, ie:
00000011
Since there are 8 bits for the host address, this means there are 2^8-2 possible addresses, or 254 addresses, on the subnet. And the broadcast address for this subnet is where the host address bits are all on, ie:
11000000 10101000 00001010 11111111
(which can be written as 192.168.10.255)
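The numbers in this example can be cross-checked with Python's standard ipaddress module. This is just one way to verify them, assuming the /24 prefix guessed above:

```python
import ipaddress

# An interface is an address together with the prefix length of its subnet.
interface = ipaddress.ip_interface("192.168.10.3/24")
network = interface.network

print(network)                    # 192.168.10.0/24
print(network.netmask)            # 255.255.255.0
print(network.broadcast_address)  # 192.168.10.255
print(network.num_addresses - 2)  # 254

# The host part is whatever is left after masking off the network prefix.
host_part = int(interface.ip) & int(network.hostmask)
print(host_part)                  # 3
```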
Note that even though the parts of the address are broken up on an 8 bit boundary here, they can actually be broken at any point, as defined by the netmask.
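To illustrate a split that does not fall on an 8-bit boundary, here is the same address with a hypothetical 20-bit prefix (the /20 is chosen purely for illustration), again using Python's standard ipaddress module:

```python
import ipaddress

network = ipaddress.ip_interface("192.168.10.3/20").network

# Show the netmask as four groups of 8 bits: the boundary falls inside the third group.
mask_bits = format(int(network.netmask), "032b")
grouped = " ".join(mask_bits[i:i + 8] for i in range(0, 32, 8))

print(grouped)                    # 11111111 11111111 11110000 00000000
print(network.netmask)            # 255.255.240.0
print(network)                    # 192.168.0.0/20
print(network.broadcast_address)  # 192.168.15.255
print(network.num_addresses - 2)  # 4094
```

With 12 host bits, the same 2^N-2 rule gives 2^12-2 = 4094 host addresses.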