Urban traffic flow prediction plays a crucial role in smart city management. The future flow volume of a road (treated as a node in a traffic network) after time t depends on the historical volume (up to and including time t) of the road itself and of its neighboring roads. Recent studies have therefore approached traffic flow prediction with a spatial-temporal adjacency matrix (AM) of traffic nodes, constructed from historical traffic and node connections. However, the AM is often built from statistical traffic information before t rather than from the volume at the level of individual time units. In addition, the spatial and temporal relations between traffic nodes are fused manually in the AM rather than through a trainable fusion. To address these issues, we propose a trainable context-enhanced similarity graph, which fuses the unit-level similarity of traffic time series with multiple inter-node contextual relations through a learnable embedding model. We further propose a Quotient Neural Network that perceives the explicit relations among short-memory flow values to facilitate forecasting. Based on these two modules, we propose a novel Quotient Graph Neural Network (QuoGNN). Experiments on four real-world benchmark datasets demonstrate the superior performance of the proposed model over state-of-the-art baselines on multiple evaluation metrics. The model implementation and the datasets are available.¹