Many previous studies have shown that some nodes should remain in a sleeping state when traffic is high. These nodes can wake up periodically when they have data to transmit. System throughput increases when each node alternates periodically between the wake-up and sleeping states, and if a node can estimate its active rate and wake up at the optimal time, the collision probability decreases. This study proposes a Q-learning-based distributed queuing medium access control (QL-based DQMAC) protocol for Internet-of-Things (IoT) networks. In the proposed QL-based DQMAC, we derive the optimal number of contending IoT nodes. Each node calculates its active rate by itself through the Q-learning algorithm and then decides, according to that rate, whether to be active or in sleeping mode in the next contention period. Keeping the number of contending nodes near the optimum in each contention period reduces the collision probability, so both the energy consumed in contention and the MAC contention delay decrease with the lower number of contentions. Comparison with other DQMAC protocols shows that the proposed QL-based DQMAC protocol achieves higher performance in IoT networks.
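The per-node active/sleep decision described above can be sketched as a simplified, stateless (bandit-style) Q-learning loop. This is an illustrative toy, not the paper's protocol: the reward values, learning parameters, and the fixed optimal number of contenders are all assumptions introduced for the sketch.

```python
import random

class QLNode:
    """Toy IoT node that learns an active/sleep policy with stateless
    (bandit-style) Q-learning. All parameter values are assumptions."""

    def __init__(self, alpha=0.1, epsilon=0.1):
        self.alpha = alpha        # learning rate
        self.epsilon = epsilon    # exploration probability
        self.q = [0.0, 0.0]       # Q-values: index 0 = sleep, 1 = active

    def choose_action(self):
        """Epsilon-greedy selection of sleep (0) or active (1)."""
        if random.random() < self.epsilon:
            return random.randint(0, 1)
        return self.greedy_action()

    def greedy_action(self):
        """Pick the currently higher-valued action (sleep wins ties)."""
        return 0 if self.q[0] >= self.q[1] else 1

    def update(self, action, reward):
        """Move the chosen action's Q-value toward the observed reward."""
        self.q[action] += self.alpha * (reward - self.q[action])


def simulate(n_nodes=20, periods=2000, optimal_active=5, seed=0):
    """Run contention periods and return how many nodes end up preferring
    to contend. Assumed rewards: +1 for a likely-successful transmission
    when the number of contenders is at or below the optimum, -1 for a
    collision-prone period, and -0.1 for deferring (extra delay)."""
    random.seed(seed)
    nodes = [QLNode() for _ in range(n_nodes)]
    for _ in range(periods):
        actions = [node.choose_action() for node in nodes]
        n_active = sum(actions)
        for node, action in zip(nodes, actions):
            if action == 1:
                reward = 1.0 if n_active <= optimal_active else -1.0
            else:
                reward = -0.1
            node.update(action, reward)
    return sum(node.greedy_action() for node in nodes)
```

In this toy setup, nodes that repeatedly collide learn a low Q-value for contending and back off into sleep mode, which is the qualitative effect the abstract attributes to the learned active rate.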