In this paper, we present a probabilistic approach to learning task-specific and shared representations in CNNs for multi-task learning. We propose stochastic filter groups (SFG), a mechanism to assign convolution kernels in each layer to specialist or generalist groups, which are specific to or shared across different tasks, respectively. The SFG modules determine the connectivity between layers and the structures of task-specific and shared representations in the network. We employ variational inference to learn the posterior distribution over the possible grouping of kernels and network parameters.