MMA Memo 167: High Level Computing Information Flow for the MMA

M.A. Holdaway
National Radio Astronomy Observatory
949 N. Cherry Ave.
Tucson, AZ 85721-0655
email: mholdawa@nrao.edu

April 16, 1997

Abstract:

We map out a straw man concept of the information flow in the computing system for the MMA. We track the information from the proposal stage through refereeing, time allocation, choosing a project from the queue, micro-scheduling the project, flexible queue observing, automated pipeline imaging of preliminary images, automated pipeline imaging of the final images, and archiving. If each of these steps can be accomplished fully electronically, we can design a highly integrated system which is capable of meeting the MMA computing demands which have been laid out by the MMA Computing Working Group (MMA Memo 164).

Organizing Principles

The MMA computing system needs to greatly exceed the capabilities of any radio astronomical facility yet built. In order to optimize the scientific throughput of the instrument, we envision flexible scheduling (scheduling a project which can meet its scientific goals in the current weather conditions), flexible micro-scheduling (changing the calibration strategy, integration time, and possibly postponing the project as the weather conditions change during the observation), and automated imaging of the data with the use of an imaging pipeline. These goals sound very challenging, especially in the computing environments of existing telescopes. In order to meet these goals, we must design a computing system from start to finish with these goals in mind. Two guiding principles help us in this design:

We require that at every step the information reside in the computer, never primarily as hard copy.
We seek whenever possible to automate procedures which can be described algorithmically.

In spite of these principles, some human input, guidance, and oversight are required. The main purpose of the high level information flow we describe here is to show where we need information from humans, where we need information from machines or computer programs, what information will come out of the system, and what algorithmic tools will need to be developed in the future. We merely hint at the types of information and algorithmic tools which will be required, leaving a more detailed look at the problem for a later time.

Highest Level Information Flow

At the highest level, the information flow of the MMA is simple: the proposer asks to observe a source, the time allocation committee (with the referees' help) decides how much time to give the project and may modify the scope of the project, the source of a successful project is observed by the telescope, the telescope produces data, the data are calibrated and imaged, and the images are sent back to the astronomer. Variations of this general scheme involve some amount of input from the astronomer during the observations and the calibration and imaging stages. We intend to design a system which allows for astronomer involvement ranging between the extremes:

the astronomer sets the specifications for the observations in the proposal and after several months gets images in the mail.
the astronomer requires control at all points of the process: in project queuing, in micro-scheduling of the project, and in calibration and imaging. (If too successful, this astronomer may be hired by NRAO to upgrade the algorithmic software for automated observing and imaging.)

Proposal Handling and Time Allocation

This section deals with the flow of information from the preparation and submission of the proposal to the acceptance of the proposal into the observing queue, as illustrated in Figure 1. (In these diagrams, people or groups of people are represented by ellipses, files or databases are represented by rectangles, and algorithms or machines which generate or eat information are represented by flattened hexagonal shapes.) We describe the steps in the proposal and time allocation process:

The proposer generates the observing proposal (OP, which also stands for observation parameters). Current proposals often ask for most of the information which is required to perform the actual observations, such as source position, observing frequency, correlator setup, and required noise level, but that information is not in a machine-usable format. If the observation parameters of the proposal are in the correct machine readable format, we can literally give the proposal to the telescope and ask it to observe. The general strategy will be for proposals to specify scientific requirements, such as noise level and dynamic range, rather than the observing time required. In addition, environmental requirements, such as maximum acceptable rms residual phase error level or opacity, could be specified. Some tools will assist the proposer in generating the proposal:
- A proposal generating tool (think of a forms interface on a web page for starters) will ensure that the required information is provided in the correct format.
- A proposal viewing tool will convert the native proposal information into a convenient viewing format such as postscript or HTML.
- A time estimation tool, complete with MMA sensitivities and site characteristic statistics, will assist in estimating the time required to meet the desired scientific goals (i.e., noise level).
The proposer electronically sends the proposal to the NRAO Director's office. NRAO may choose to accept paper proposals. If so, NRAO would need to employ someone to insert the proposal into the appropriate computer readable form.
NRAO sets up a database of observing proposals (Set of OP's) resident on an NRAO computer. This database will have limited access.
The Director's office informs the referees that the Set of OP's is complete.
Referees view the OP's over the internet.
Referees generate comments and suggested changes to the OP's. For example, in addition to comments about the general merit of the observing proposal, a referee might indicate that the desired noise level is unrealistic or not necessary. The general form of the comments will mirror that of the OP, though most items will contain no information unless the referee explicitly thinks a change in the observation parameters is required. A software tool will help the referees input their comments and will automatically input them into the limited access Set of OP's database.
When the referees' comments and suggested modifications are in place, the Time Allocation Committee (TAC) reviews all of the proposals.
The TAC makes its own comments and modifications to the observation parameters, and....
...assigns each project a queue priority.
When the TAC has completed its job, the Observing Queue database is generated from the Set of OP's database. The modifications to the observation parameters decided upon by the TAC are applied to the observer's original observation parameters, and the OP's are ranked by priority in the queue. There will be a software tool which converts the Observing Queue database into an easily viewed form which can be publicly accessed via the internet. Some information may not be available for public viewing, such as the referees' comments or the text of the scientific justification. The Observing Queue is for use both within and outside NRAO.
When the Observing Queue and its easily viewed form are complete, the TAC runs a program that goes through the Set of OP's and sends the appropriate referee comments and the OP's priority to each proposer. The proposer is also instructed to look at the newly generated Observing Queue and is advised about when the OP is likely to be observed.
The proposer checks the Observing Queue for herself.
The proposer decides that the observation parameters she originally proposed are not quite correct, or that she is unhappy with some modification implemented by the TAC. The proposer can modify her OP in the observing queue, via an NRAO staff scientist who verifies that the change will not significantly alter the estimated observing time. Such changes could occur right up to, or even in the middle of, observations.
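The step in which the TAC's modifications are applied to the original observation parameters and the OP's are ranked into the Observing Queue can be sketched roughly as follows. This is only an illustrative Python sketch, not a design: the `ObservingProposal` class, the parameter names (`freq_ghz`, `noise_mjy`), and the convention that priority 1 ranks first are all assumptions made for the example. It relies on the idea stated above that the modification record mirrors the form of the OP and is empty wherever no change is requested.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ObservingProposal:
    """Hypothetical OP record; every field name here is an assumption."""
    proposal_id: str
    params: Dict[str, Any]  # observation parameters as proposed
    # Mirrors `params`; an item is present only if the TAC wants a change.
    tac_changes: Dict[str, Any] = field(default_factory=dict)
    priority: Optional[int] = None  # set by the TAC (assumed: 1 = highest)


def apply_tac_changes(op: ObservingProposal) -> Dict[str, Any]:
    """Overlay the TAC's changes on a copy of the proposer's parameters."""
    merged = dict(op.params)       # leave the original proposal untouched
    merged.update(op.tac_changes)  # only explicitly changed items override
    return merged


def build_observing_queue(ops: List[ObservingProposal]) -> List[Dict[str, Any]]:
    """Generate priority-ranked queue entries from the Set of OP's."""
    accepted = [op for op in ops if op.priority is not None]
    accepted.sort(key=lambda op: op.priority)
    return [
        {
            "proposal_id": op.proposal_id,
            "priority": op.priority,
            "params": apply_tac_changes(op),
        }
        for op in accepted
    ]


# Example with made-up proposals.
ops = [
    ObservingProposal("OP-A", {"freq_ghz": 230.0, "noise_mjy": 0.1},
                      tac_changes={"noise_mjy": 0.3}, priority=2),
    ObservingProposal("OP-B", {"freq_ghz": 345.0, "noise_mjy": 1.0}, priority=1),
    ObservingProposal("OP-C", {"freq_ghz": 115.0, "noise_mjy": 0.5}),  # no time given
]
queue = build_observing_queue(ops)
```

Keeping the original parameters and the changes as separate records, and merging them only when the queue is generated, means the proposer can still see exactly what was altered and by whom.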

Project Scheduling, Observing, and Automated Imaging

At this point, the Observing Queue is all set for dozens or even hundreds of observations. Each OP in the Queue has a set of complete specifications which enable the NRAO to prepare a micro-schedule (i.e., the detailed minute by minute schedule). This section goes through the stages of picking an appropriate project from the queue, scheduling, observing, and automated imaging, which will feed back into the queuing and the scheduling. This process is illustrated in Figure 2.

The physical environment of the MMA will affect observations in many ways, and some conditions will not permit successful completion of certain very demanding observations, but will permit other less demanding OP's without difficulty. In principle, we should be able to determine if a given set of environmental conditions would allow an OP to be observed successfully. To aid in this determination, we will need a series of environmental monitors, such as
- one or more tippers measuring opacity at various frequencies.
- one or more phase stability monitors measuring phase fluctuations at various sites about the array.
- one or more weather stations measuring, among other things, the wind speed and the wind gustiness, which will be used to estimate the pointing errors the antennas would experience.
- one or more sets of thermal probes on strategic structural members of the antennas to estimate the thermal contribution to the pointing errors and the surface errors.
- some monitor of the basic health of the array.
The information from these environmental monitors would affect the selection of an OP from the Observing Queue, the scheduling of that OP, and the termination of that OP upon changing environmental conditions.
An automated picking algorithm looks at the Observing Queue and selects the highest priority OP which does not have a hold on it and which can likely be completed successfully under the current environmental conditions. (The observer may observe a small part of the project and then put a hold on that OP until she has verified the observations are proceeding as expected.)
The scheduler uses the output from the environmental monitors and the selected OP to generate the detailed telescope instructions that those familiar with the VLA might call an ``observe file''. The environmental monitors will affect details such as how often the phase calibrator is observed, or if pointing calibration is performed.
The detailed telescope instructions are passed on to a telescope control computer. This stage is much more complicated than shown, but these functions are already operating on the VLA and many other radio astronomical instruments, so we put them into a black box.
Astronomical and monitor data are filled directly onto disk, summarized and placed in a global summary observational database, and archived onto an archiving medium.
The imaging pipeline software reads the data from disk and calibrates and images some subset of the data to produce preliminary images. Imaging parameters (i.e., weighting scheme, cell size, deconvolution algorithm) are specified in the OP. The data subset (i.e., what channels, resolution) and how often the subset is imaged are also specified in the OP, or else sensible default values are used.
The preliminary images are evaluated, and the results of the evaluation are fed back into the scheduler and to the proposer. Under some circumstances indicated by the OP, the project will be put ``on hold'' after a certain condition is met. When an OP is on hold, it is sent back to the queue in an inactive state, and the proposer is informed of the OP's status.
The proposer needs to interact with the queue to reactivate the OP.
At points specified in the OP (or by default, when the OP has been completed), the imaging pipeline will produce the ``final'' images using imaging parameters specified in the OP. The ``final'' images are archived and sent to the proposer. In many cases these ``final'' images will indeed be limited by thermal noise, and no further data processing will be required. In more demanding cases, the images will require further processing before analysis can proceed. We expect that as the MMA matures, the fraction of projects for which the pipeline produces acceptable final images will increase.
Operations marked with the small triangle could be confirmed by a staff scientist to ensure that the system is running smoothly. Human override of the automated system is also possible at these points, and the observer could, in principle, perform these operations herself.
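The automated picking step described above (select the highest priority OP that is not on hold and can be observed under the current conditions) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the picking algorithm itself: the `Conditions` and `QueueEntry` classes, their fields, the simple threshold test, and the numerical limits are all hypothetical. A real picker would also need to consider source availability, array health, and the likelihood that conditions persist long enough to complete the OP.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Conditions:
    """Hypothetical snapshot from the environmental monitors."""
    opacity: float        # from a tipper
    phase_rms_deg: float  # from a phase stability monitor
    wind_mps: float       # from a weather station


@dataclass
class QueueEntry:
    """Hypothetical queue entry with the OP's environmental requirements."""
    proposal_id: str
    priority: int         # assumed: 1 = highest
    max_opacity: float
    max_phase_rms_deg: float
    max_wind_mps: float
    on_hold: bool = False


def is_observable(entry: QueueEntry, now: Conditions) -> bool:
    """True if every monitored quantity is within the OP's stated limits."""
    return (now.opacity <= entry.max_opacity
            and now.phase_rms_deg <= entry.max_phase_rms_deg
            and now.wind_mps <= entry.max_wind_mps)


def pick(queue: List[QueueEntry], now: Conditions) -> Optional[QueueEntry]:
    """Return the highest priority observable entry not on hold, or None."""
    candidates = [e for e in queue if not e.on_hold and is_observable(e, now)]
    return min(candidates, key=lambda e: e.priority, default=None)


# Example with made-up limits and conditions.
queue = [
    QueueEntry("OP-1", 1, max_opacity=0.05, max_phase_rms_deg=10.0, max_wind_mps=8.0),
    QueueEntry("OP-2", 2, max_opacity=0.30, max_phase_rms_deg=40.0, max_wind_mps=12.0,
               on_hold=True),
    QueueEntry("OP-3", 3, max_opacity=0.30, max_phase_rms_deg=40.0, max_wind_mps=12.0),
]
mediocre = Conditions(opacity=0.15, phase_rms_deg=25.0, wind_mps=6.0)
excellent = Conditions(opacity=0.03, phase_rms_deg=5.0, wind_mps=3.0)
storm = Conditions(opacity=0.80, phase_rms_deg=90.0, wind_mps=20.0)

picked_mediocre = pick(queue, mediocre)    # OP-1 too demanding, OP-2 on hold
picked_excellent = pick(queue, excellent)  # the demanding OP-1 gets the good weather
picked_storm = pick(queue, storm)          # nothing can be observed
```

In mediocre conditions the sketch passes over the demanding OP-1 and the held OP-2 and falls through to OP-3, which is the behavior the flexible scheduling concept relies on: good weather is reserved for the projects that need it.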

Some kinds of observations, such as VLBI, do not lend themselves to queued observations. We therefore must also maintain the possibility of fixed time observations. An observer who is observing with nonstandard modes, or modes which are not fully supported, may also wish to use a fixed time observation to make observer interaction over the internet more convenient.

There are a number of pieces of software of varying degrees of complexity which must be written to make this system work. In addition to writing the required software, a lot of work must go into understanding the site and the instrument and how well the observations proceed under various conditions. And finally, the TAC will need to pay close attention to how projects are queued. There may be peculiarities of the logic of the picker which are not immediately obvious. Either the picking algorithm or the priorities assigned by the TAC may need to be modified.

References

Scott, Steve, Emerson, D., Fisher, R., Holdaway, M., Knapp, J., Mundy, L., Tilanus, R., Wright, M., 1996, MMA Memo 164: ``MMA Computing Working Group Report''.

Postscript version of Figure 1 may be easier to read.

Postscript version of Figure 2 may be easier to read.