payelb/UltraFeedback_openbmb_TinyLlama-1.1B_aligned_with_semantic_MARS_deberta_RM Updated 27 days ago
payelb/UltraFeedback_openbmb_TinyLlama-1.1B_aligned_with_semantic_MARS_RM_roberta_semantic_MARS_RM Updated 28 days ago
payelb/UltraFeedback_openbmb_roberta-large_1k_fixed_MARS_semantic_refined Text Classification • 0.4B • Updated 28 days ago • 48
payelb/PKUSafeRLHF_roberta-large_1k_fixed_MARS_semantic_refined Text Classification • 0.4B • Updated 28 days ago • 41
payelb/PKUSafeRLHF_TinyLlama-1.1B_aligned_with_semantic_MARS_RM_roberta_semantic_MARS_RM Updated 28 days ago
payelb/HHRLHF_roberta-large_1k_fixed_MARS_semantic_refined Text Classification • 0.4B • Updated 28 days ago • 48
payelb/HHRLHF_TinyLlama-1.1B_aligned_with_semantic_MARS_RM_roberta_semantic_MARS_RM Updated 28 days ago
payelb/UltraFeedback_openbmb_Llama-3.2-1B_aligned_with_baseline_roberta_RM_KLsafe Updated 29 days ago
payelb/UltraFeedback_openbmb_Llama-3.2-1B_aligned_with_semantic_MARS_roberta_RM_KLsafe Updated 29 days ago
payelb/HHRLHF_roberta-base_1k_fixed_MARS_semantic_distance_synth Text Classification • 0.1B • Updated May 8 • 11
payelb/PKUSafeRLHF_reward-model-deberta-v3-base_1k_fixed_MARS_semantic_refined Text Classification • 0.2B • Updated May 7 • 9
payelb/PKUSafeRLHF_roberta-base_1k_fixed_MARS_semantic_refined Text Classification • 0.1B • Updated May 7 • 11
payelb/UltraFeedback_openbmb_roberta-base_1k_fixed_MARS_semantic_refined Text Classification • 0.1B • Updated May 6 • 8
payelb/UltraFeedback_openbmb_reward-model-deberta-v3-base_1k_fixed_MARS_semantic_refined Text Classification • 0.2B • Updated May 4 • 8
payelb/HHRLHF_reward-model-deberta-v3-base_1k_fixed_MARS_semantic_distance_synth Text Classification • 0.2B • Updated May 2 • 12
payelb/HHRLHF_reward-model-deberta-v3-base_1k_fixed_MARS_semantic_refined_aug26 Text Classification • 0.2B • Updated May 2 • 5
payelb/HHRLHF_roberta-base_1k_fixed_MARS_semantic_refined_aug26 Text Classification • 0.1B • Updated May 2 • 4