How many questions are we to develop for the two exams?

For adequate reliability, each form of the exam should have approximately 120 items. There should be 30-40 anchor items that appear on both forms. In addition, each form should include about 25 trial (unscored) items to build up an item pool of parameterized items.
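
To turn the per-form figures into a rough total of items to write, the arithmetic can be sketched as below. This is only an illustration and rests on two assumptions the answer does not state: that the anchor items are counted within the roughly 120 items on each form, and that the trial items are additional and differ between the two forms. The function name `items_to_develop` and its parameters are made up for this example.

```python
def items_to_develop(scored_per_form=120, anchor_items=30,
                     trial_per_form=25, forms=2):
    """Count the unique items to write across all forms.

    Assumptions (not stated in the answer above):
    - anchor items are included in the scored count of each form
    - trial items are extra, unscored, and different on each form
    """
    # Anchor items are shared, so they are written only once.
    unique_scored = forms * (scored_per_form - anchor_items) + anchor_items
    # Trial items are assumed to be unique to each form.
    unique_trial = forms * trial_per_form
    return unique_scored + unique_trial


# Low and high end of the suggested anchor range.
for anchors in (30, 40):
    print(anchors, "anchor items:", items_to_develop(anchor_items=anchors))
# 30 anchor items: 260
# 40 anchor items: 250
```

Under these assumptions the two forms together call for roughly 250-260 unique items. If the anchor items are meant to be on top of the 120, or if the same trial items go on both forms, the total changes accordingly.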