Context-awareness refers to the ability of a system to adapt and respond proactively to changes in the user's situation. The multitude of data generated by the sensors available on users' mobile devices, combined with advances in machine learning techniques, supports context-aware services in recognizing the current situation and optimizing their personalization features. However, the performance of these services depends mainly on the accuracy of the context inference process, which is strictly tied to the availability of large-scale labeled datasets. In this work, we present a framework developed to collect datasets containing heterogeneous sensing data derived from personal mobile devices and characterizing the user's physical context. The framework was used by three volunteers for two weeks, generating a dataset with more than 36K samples and 1331 features. Starting from this dataset, we also propose a lightweight approach to modeling the user context that speeds up the context reasoning process and allows the entire computation to be performed on the local mobile device. To this aim, we used six dimensionality reduction techniques to reduce the feature space and optimize the context classification. Four experiments with three well-known classifiers and the dimensionality reduction techniques show that, even with a limited dataset, we achieve a 10x speedup and a feature reduction of more than 90% while keeping the accuracy loss below 3%.